Some ideas #300
antoineeripret started this conversation in Ideas
Replies: 1 comment 6 replies
-
@antoineeripret Thank you so much for the input. Always happy to get feedback/suggestions. My thoughts:

```python
import advertools as adv
import pandas as pd

# Load the crawl output (one JSON object per line) and split the URLs
crawldf = pd.read_json('output_file.jl', lines=True)
urldf = adv.url_to_df(crawldf['url'])

# Flag rows by URL directory components (boolean columns)
crawldf['product'] = urldf['dir_1'].eq('producto')
crawldf['shoes'] = urldf['dir_1'].eq('producto') & urldf['dir_2'].eq('shoes')
crawldf['offers'] = urldf['url'].str.contains('oferta')
```

Or was there something else you wanted to achieve with this?

Let me know your thoughts, and we'll discuss further. Thanks again!
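The same segmentation can also be driven by a mapping, which is close to the dict-based idea discussed in this thread. This is a plain-pandas sketch under assumptions: the `segments` dict, the sample URLs, and the `segment` column name are all hypothetical, not part of the advertools API (only the `dir_1`-style directory column mirrors what `url_to_df` produces).

```python
import pandas as pd

# Hypothetical segmentation dict: first URL directory -> segment label.
# Not an advertools API; just a sketch of the idea using plain pandas.
segments = {'producto': 'product', 'oferta': 'offer', 'blog': 'content'}

urls = pd.DataFrame({
    'url': [
        'https://example.com/producto/shoes',
        'https://example.com/oferta/summer',
        'https://example.com/blog/post-1',
        'https://example.com/about',
    ]
})

# Extract the first path directory (the equivalent of url_to_df's dir_1)
urls['dir_1'] = urls['url'].str.split('/').str[3]

# Map it to a segment label; URLs with no match get NaN
urls['segment'] = urls['dir_1'].map(segments)
```

URLs whose first directory is not in the dict simply get a missing value, so unmapped sections are easy to spot.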
-
Hi @eliasdabbas!
Firstly, thank you for creating and updating a library that is now an essential for me. Really appreciate that.
I have a couple of features that I'd like to discuss with you. I'll try to keep my comment as short as possible; let me know what you think.
- The possibility to pass a `dict` for our architecture and have a new column using this segmentation. For instance, a dictionary could be passed as an argument to the `adv.crawl` method.
- With a `client-secrets.json` key, we could add this information for a crawl or a `sitemap_to_df` method call, using this library or something custom-built if you don't want to add dependencies. I'm not sure about GA, because the GA4 API is still fresh and may be updated during the upcoming months.
- I currently use the `sitemap_to_df` and `url_to_df` methods together with a `pd.merge` to have all the information I need. Wouldn't it make sense to add the columns you already have in `url_to_df` when you call `sitemap_to_df`?

Thanks in advance for your feedback!
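The `pd.merge` pattern described above can be sketched as follows. Both frames are mocked in plain pandas here so the example is self-contained; in practice you would build them with `adv.sitemap_to_df(...)` and `adv.url_to_df(...)` (the sample URLs and `lastmod` values are made up for illustration).

```python
import pandas as pd

# Mock of a sitemap_to_df result: sitemaps expose URLs in a 'loc' column
sitemap_df = pd.DataFrame({
    'loc': ['https://example.com/producto/shoes',
            'https://example.com/blog/post-1'],
    'lastmod': ['2023-01-01', '2023-02-01'],
})

# Mock of a url_to_df result: one row per URL with dir_1, dir_2, ... columns
url_df = pd.DataFrame({
    'url': sitemap_df['loc'],
    'dir_1': ['producto', 'blog'],
    'dir_2': ['shoes', 'post-1'],
})

# Enrich the sitemap rows with the URL-component columns
merged = sitemap_df.merge(url_df, left_on='loc', right_on='url', how='left')
```

A left merge keeps every sitemap row even if a URL failed to parse, which is usually what you want when auditing a sitemap.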