I am trying to scrape images. The site gives me 23 images, but I don't want to apply a limit, and I only get 10 images. Can you help me solve this problem?
import requests
from bs4 import BeautifulSoup
import pandas as pd

baseurl = 'https://twillmkt.com'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36'
}

# Collect the product links from the collection page
r = requests.get('https://twillmkt.com/collections/denim')
soup = BeautifulSoup(r.content, 'html.parser')
tra = soup.find_all('div', class_='ProductItem__Wrapper')
productlinks = []
for links in tra:
    for link in links.find_all('a', href=True):
        comp = baseurl + link['href']
        productlinks.append(comp)

# Visit every product page and record its thumbnail images
data = []
for link in set(productlinks):
    r = requests.get(link, headers=headers)
    soup = BeautifulSoup(r.content, 'html.parser')
    up = soup.find('div', class_='Product__SlideshowNavScroller')
    for e, pro in enumerate(up):
        t = pro.find('img').get('src')
        data.append({'id': t.split('=')[-1], 'image': 'Image ' + str(e) + ' UI', 'link': t})

# Reshape to one row per id and one column per image
df = pd.DataFrame(data)
df.image = pd.Categorical(df.image, categories=df.image.unique(), ordered=True)
df = df.pivot(index='id', columns='image', values='link').reset_index().fillna('')
df.to_csv('kj.csv')
Answer from a uj5u.com user:
Slice the result set of images with [:10]:
...
up = soup.select('div.Product__SlideshowNavScroller img')[:10]
for e, pro in enumerate(up):
    t = pro.get('src')
    data.append({'id': t.split('=')[-1], 'image': 'Image ' + str(e) + ' UI', 'link': t})
...
If you want to start naming the images from 1 instead of 0:
...
up = soup.select('div.Product__SlideshowNavScroller img')[:10]
for e, pro in enumerate(up, start=1):
    t = pro.get('src')
    data.append({'id': t.split('=')[-1], 'image': 'Image ' + str(e) + ' UI', 'link': t})
...
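To see the effect of the slice and of start=1 without touching the site, here is a minimal sketch on a plain list. The helper name label_images and the example.com URLs are placeholders invented for illustration; they are not part of the original code.

```python
def label_images(srcs, limit=None, start=1):
    """Return (label, src) pairs; limit=None keeps every image."""
    selected = srcs[:limit]  # srcs[:None] is the whole list, srcs[:10] the first ten
    return [('Image ' + str(e) + ' UI', src)
            for e, src in enumerate(selected, start=start)]


# 23 placeholder URLs standing in for the scraped 'src' values
srcs = ['//example.com/img-' + str(n) + '.jpg' for n in range(1, 24)]

capped = label_images(srcs, limit=10)   # only the first ten images
everything = label_images(srcs)         # no limit applied

print(len(capped), capped[0][0], capped[-1][0])
print(len(everything), everything[-1][0])
```

Dropping the [:10] slice (limit=None here) is what keeps all images, so the slice is only useful if a cap is actually wanted.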
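EDIT
OK, got it. The behavior does not depend on the number of images. The issue here is that the id is not unique: it is not the product's id/sku. In the question's code the id is t.split('=')[-1], which is the value after the last = in the image URL.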
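A small sketch of why that id is unreliable. The two URLs are copied from the last row of the output table in this answer, so they belong to the same product; the helper name id_from_src is made up for illustration.

```python
def id_from_src(src):
    # Same expression as in the question: the text after the last '='
    return src.split('=')[-1]


# Both links are listed for the product with sku LOTFEELPJ564-S-BRN
product_image = ('//cdn.shopify.com/s/files/1/0089/7912/0206/products/'
                 'LOTFEELPJ564_16_160x.jpg?v=1639467815')
size_chart = ('//cdn.shopify.com/s/files/1/0089/7912/0206/products/'
              'sizechart-stretch-pants_3_ec7e0b0c-1043-4306-a766-33f7e0b3edc8_160x.png?v=166994677')

ids = {id_from_src(product_image), id_from_src(size_chart)}
print(ids)  # two different ids for images of one product
```

With this id, the images of one product can end up in different rows after the pivot, and the value says nothing about which product the row belongs to.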
How to fix it?
Select the sku from the product page and use it as the id in the DataFrame:
sku = soup.select_one('.oos_sku').text.strip().split(' ')[-1]
for e, pro in enumerate(up, start=1):
    t = pro.get('src')
    data.append({'id': sku, 'image': 'Image ' + str(e) + ' UI', 'link': t})
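The reshaping step can be pictured with a plain-Python sketch of what the pivot does with these records: one row per id, one column per image label. The function name to_wide, the SKU-A/SKU-B ids and the example.com links are placeholders, not values from the site.

```python
from collections import defaultdict


def to_wide(records):
    """Group long records ({'id', 'image', 'link'}) into one dict per id."""
    rows = defaultdict(dict)
    for rec in records:
        rows[rec['id']][rec['image']] = rec['link']
    return dict(rows)


records = [
    {'id': 'SKU-A', 'image': 'Image 1 UI', 'link': '//example.com/a1.jpg'},
    {'id': 'SKU-A', 'image': 'Image 2 UI', 'link': '//example.com/a2.jpg'},
    {'id': 'SKU-B', 'image': 'Image 1 UI', 'link': '//example.com/b1.jpg'},
]

wide = to_wide(records)
for sku, images in wide.items():
    print(sku, images)
```

One difference to keep in mind: this sketch silently overwrites a link when the same id and image label occur twice, whereas pandas pivot raises a ValueError about duplicate entries in that case. That is another reason the id should identify exactly one product.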
Example
import requests
from bs4 import BeautifulSoup
import pandas as pd

baseurl = 'https://twillmkt.com'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36'
}

# Collect the product links from the collection page
r = requests.get('https://twillmkt.com/collections/denim')
soup = BeautifulSoup(r.content, 'html.parser')
tra = soup.find_all('div', class_='ProductItem__Wrapper')
productlinks = []
for links in tra:
    for link in links.find_all('a', href=True):
        comp = baseurl + link['href']
        productlinks.append(comp)

# Visit every product page and record all thumbnail images under the product's sku
data = []
for link in set(productlinks):
    r = requests.get(link, headers=headers)
    soup = BeautifulSoup(r.content, 'html.parser')
    up = soup.select('div.Product__SlideshowNavScroller img')
    sku = soup.select_one('.oos_sku').text.strip().split(' ')[-1]
    for e, pro in enumerate(up, start=1):
        t = pro.get('src')
        data.append({'id': sku, 'image': 'Image ' + str(e) + ' UI', 'link': t})

# Reshape to one row per sku and one column per image
df = pd.DataFrame(data)
df.image = pd.Categorical(df.image, categories=df.image.unique(), ordered=True)
df = df.pivot(index='id', columns='image', values='link').reset_index().fillna('')
df  # .to_excel('test.xlsx')
Output
| | id | Image 1 UI | Image 2 UI | Image 3 UI | Image 4 UI | Image 5 UI | Image 6 UI | Image 7 UI | Image 8 UI | Image 9 UI | Image 10 UI | Image 11 UI | Image 12 UI | Image 13 UI | Image 14 UI | Image 15 UI | Image 16 UI | Image 17 UI | Image 18 UI | Image 19 UI | Image 20 UI | Image 21 UI | Image 22 UI | Image 23 UI | Image 24 UI |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | LOTFEELPJ023-30 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-2_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-3_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-4_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-5_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-6_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-7_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-8_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-9_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-10_160x.jpg?v=1631812617 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/Blue-Ripped-Knee-Distressed-Skinny-Denim-11_160x.jpg?v=1631812617 | |||||||||||||
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
47 | LOTFEELPJ564-S-BRN | //cdn.shopify.com/s/files/1/0089/7912/0206/products/LOTFEELPJ564_16_160x.jpg?v=1639467815 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/LOTFEELPJ564_17_160x.jpg?v=1639467815 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/LOTFEELPJ564_22_160x.jpg?v=1639467815 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/LOTFEELPJ564_15_160x.jpg?v=1639467815 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/LOTFEELPJ564_6_160x.jpg?v=1639467815 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/LOTFEELPJ564_9_160x.jpg?v=1639467815 | //cdn.shopify.com/s/files/1/0089/7912/0206/products/sizechart-stretch-pants_3_ec7e0b0c-1043-4306-a766-33f7e0b3edc8_160x.png?v=166994677 |
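The links in the output are protocol-relative (they start with //). If absolute URLs are needed, for example to download the files, one possible sketch uses urljoin from the standard library. The helper name absolute_url is hypothetical, and using the shop's https base URL as the scheme source is an assumption.

```python
from urllib.parse import urljoin


def absolute_url(src, base='https://twillmkt.com'):
    # For a '//host/path' reference, urljoin keeps the host and path
    # and takes only the scheme from the base URL.
    return urljoin(base, src)


src = ('//cdn.shopify.com/s/files/1/0089/7912/0206/products/'
       'Blue-Ripped-Knee-Distressed-Skinny-Denim_160x.jpg?v=1631812617')
full = absolute_url(src)
print(full)
```

The same call also works for site-relative paths such as the product hrefs, so it could replace the manual baseurl + link['href'] concatenation if preferred.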