NoSuchElementException when scraping data from a Discogs URL with Selenium

I am trying to extract some data from a Discogs URL with Selenium, but I suspect I am selecting the wrong elements.

Starting from the URL, I am trying to get this output in the console:

Artista 1: The Sound Man Featuring Mercy (3) – The Factory
Testo elemento 1: The Factory (Original Mix)
Testo elemento 2: The Factory (Bass Dub)
Testo elemento 3: The Factory (Junior's Factory Dub)
Testo elemento 4: The Factory (Sexapella)
Testo elemento 5: The Factory (Klubb Kidz Flava Dub)
Testo elemento 6: The Factory (Klubb Kidz School Dub)
Testo elemento 7: The Factory (Duke's Massive Blast)

To work out the selectors, I inspected that section of the page in DevTools and saw this:

https://i.imgur.com/Q8Sbdk2.png

But I get these errors:

C:\Users\Peter\Desktop\script\BLOCCO 1\selenium>python canzonidiscogs.py
Inserisci l'URL di Discogs: https://www.discogs.com/it/master/103917-The-Sound-Man-Featuring-Mercy-The-Factory

DevTools listening on ws://127.0.0.1:59139/devtools/browser/b0724a48-9b6e-401f-8e58-7882ae487739
Artista 1: The Sound Man Featuring Mercy (3) – The Factory
Traceback (most recent call last):
  File "C:\Users\Peter\Desktop\script\BLOCCO 1\selenium\canzonidiscogs.py", line 21, in <module>
    artist = element.find_element(By.CSS_SELECTOR, 'td[class^="title_"]> a').text
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\selenium\webdriver\remote\webelement.py", line 417, in find_element
    return self._execute(Command.FIND_CHILD_ELEMENT, {"using": by, "value": value})["value"]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\selenium\webdriver\remote\webelement.py", line 395, in _execute
    return self._parent.execute(command, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\selenium\webdriver\remote\webdriver.py", line 346, in execute
    self.error_handler.check_response(response)
  File "C:\Python311\Lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 245, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"td[class^="title_"]> a"}
  (Session info: chrome=115.0.5790.110); For documentation on this error, please visit: https://www.selenium.dev/documentation/webdriver/troubleshooting/errors#no-such-element-exception
Stacktrace:
Backtrace:
        GetHandleVerifier [0x0069A813+48355]
        (No symbol) [0x0062C4B1]
        (No symbol) [0x00535358]
        (No symbol) [0x005609A5]
        (No symbol) [0x00560B3B]
        (No symbol) [0x00559AE1]

This is the code I use for the extraction:

from selenium.webdriver import Chrome
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Ask the user to enter the Discogs URL
url = input("Inserisci l'URL di Discogs: ")

driver = Chrome()
wait = WebDriverWait(driver, 10)

driver.get(url)

title = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'h1[class^="title_"]'))).text
print(f"Artista 1: {title}")

# Use the provided CSS selector to select the table elements
container = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'div[class^="content_1TFzi"]')))

for i, element in enumerate(container, start=1):
    artist = element.find_element(By.CSS_SELECTOR, 'td[class^="title_"]> a').text
    print(f"Artista {i}: {artist}")

driver.quit()

Answer 1

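Try selecting the rows of the tracklist table directly and reading each row's text, rather than looking for a 'td[class^="title_"]> a' link inside every "content_" container. find_element raises NoSuchElementException as soon as one container has no matching link: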

from selenium.webdriver import Chrome
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = Chrome()
wait = WebDriverWait(driver, 10)

url = "https://www.discogs.com/it/master/103917-The-Sound-Man-Featuring-Mercy-The-Factory"
driver.get(url)

title = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'h1[class^="title_"]'))).text
print(f"Artista 1: {title}")

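# Wait until every row of the tracklist table is visible
container = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, 'table[class^="tracklist_"]>tbody>tr')))

# Print the visible text of each row, numbered from 1
for i, element in enumerate(container, start=1):
    artist = element.text
    print(f"Testo elemento {i}: {artist}")

# Close the browser when done
driver.quit()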

Output:

Artista 1: The Sound Man Featuring Mercy (3) – The Factory
Testo elemento 1: The Factory (Original Mix)
Testo elemento 2: The Factory (Bass Dub)
Testo elemento 3: The Factory (Junior's Factory Dub)
Testo elemento 4: The Factory (Sexapella)
Testo elemento 5: The Factory (Klubb Kidz Flava Dub)
Testo elemento 6: The Factory (Klubb Kidz School Dub)
Testo elemento 7: The Factory (Duke's Massive Blast)
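
If you only want one cell from each row, and some rows might not contain it, a variation along these lines may help. It is only a sketch under assumptions: the helper names extract_titles and main are made up for illustration, and the title_selector default is a placeholder taken from the question's selector, not something checked against the live page, so adjust it to whatever DevTools shows. The idea is that find_elements returns an empty list when nothing matches, whereas find_element raises NoSuchElementException.

```python
def extract_titles(rows, by, selector):
    """Return the stripped text of the first match of `selector` in each row.

    Rows without a match are skipped instead of raising an exception.
    """
    titles = []
    for row in rows:
        # find_elements (plural) returns [] when nothing matches,
        # so no NoSuchElementException is raised here.
        matches = row.find_elements(by, selector)
        if matches:
            titles.append(matches[0].text.strip())
    return titles


def main(url, title_selector='td[class^="title_"]'):
    # title_selector is an ASSUMPTION (placeholder); verify it in DevTools.
    # Selenium is imported here so extract_titles can be tried without a browser.
    from selenium.webdriver import Chrome
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.wait import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    driver = Chrome()
    try:
        wait = WebDriverWait(driver, 10)
        driver.get(url)
        # Same row selector as in the answer above.
        rows = wait.until(EC.presence_of_all_elements_located(
            (By.CSS_SELECTOR, 'table[class^="tracklist_"]>tbody>tr')))
        titles = extract_titles(rows, By.CSS_SELECTOR, title_selector)
        for i, title in enumerate(titles, start=1):
            print(f"Testo elemento {i}: {title}")
    finally:
        # Close the browser even if a wait times out.
        driver.quit()
```

Call main() with the Discogs URL to run it. If nothing is printed, the placeholder selector most likely does not match any cell in the rows and needs to be changed.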
