What are the ImportFromWeb features?
ImportFromWeb comes with some great features for a powerful and efficient web scraping from Google Sheets, including:
- Cache control to always keep control over your ImportFromWeb usage
- Javascript rendering: ImportFromWeb successfully scrapes data even from complex web pages, such as JavaScript-rendered websites
- Location based content scraping, so you decide the IP address to scrape data from
- Mass web scraping, so you scrape seamlessly hundreds or thousands of URLs at the same time
- Automatic updates, so you update your data on a regular basis (even when you’re offline)
- Function monitoring to control and understand the data scraped
- Auto-refresh, so you don’t have to relaunch the =IMPORTFROMWEB() functions manually
- IP Rotation: ImportFromWeb uses proxy servers to fetch web pages from different IP addresses so that pages load correctly, every time
- Number formatting: ImportFromWeb automatically applies number formatting to data, including currency, date and time
- Templates catalog, so you can enjoy all ImportFromWeb’s ready-to-use solutions designed for Amazon, Google Maps, Google Search, YouTube, Instagram, Yahoo Finance, …
What websites can I scrape with ImportFromWeb?
With ImportFromWeb, you have the power to scrape data from any website! Whether it’s a simple static page or a dynamic site loaded with JavaScript, ImportFromWeb can handle it all.
Using XPaths or CSS selectors, you can easily extract the data you need from any website. From e-commerce platforms to news sites, blogs, and more, the possibilities are endless. Just specify the elements you want to scrape, and ImportFromWeb will do the heavy lifting for you.
However, it’s important to note that ImportFromWeb is designed to scrape data from public pages only. Please respect the website’s terms of service and ensure that you’re scraping within legal and ethical boundaries.
Can I extract data from a page that requires login?
ImportFromWeb doesn’t have the ability to login to a website.
However, users with technical skills may be able to use the function to extract data form private pages:
Like ImportJSON, ImportFromWeb accepts a cURL request as first parameter. You might be able to copy the full cURL request from the Chrome Developers tool and use it in your IMPORTFROMWEB function.
Beware that, by signing up to a platform, you may have accepted the platform’s terms and conditions. Check carefully that using a tool like ImportFromWeb
Can I scrape data from JavaScript-rendered websites?
ImportFromWeb is designed to handle JavaScript-rendered websites with ease.
With the js_rendering option, you can effortlessly scrape data from websites that rely on JavaScript to load and display content. It’s a powerful feature that enables you to access even more data-rich websites and gather the information you need. If you’re wondering how to use this feature, we’ve got you covered! Check out our comprehensive guide on how to retrieve content from JavaScript-rendered websites.
Do I need to know XPaths to use ImportFromWeb?
No! You can use CSS selectors instead.
While XPaths offer more flexibility and allow more complex queries, CSS selectors are known by most people who have basics in HTML/CSS.
If you are unsure of how to use CSS selectors, you can read this excellent guide from CSS-Tricks
When is data updated?
Your data is updated when the cache expires or when you choose to update it either with a manual update using the RUN button or through a trigger to automate it.
Check out our full guide on how to control when the data updates.
What is Cache control?
Fetching the same content constantly is not efficient. That is why ImportFromWeb caches the source code once loaded with the expected data.
By default, ImportFromWeb cache has a lifetime of 24 hours. Some plans allow to customize this value with the cacheLifespan option.
Can I scrape location-based content?
Yes, you can! With the powerful options of the IMPORTFROMWEB() function, you can scrape location-based content using the country_code option. This allows you to control the location, and our bots will fetch the webpage from an IP address of the specified country. For more details, check out our guide on how to scrape location-based content.
Does ImportFromWeb scrape from my IP address?
Rest assured, ImportFromWeb does not scrape websites from your IP address.
We prioritize your privacy and employ a sophisticated system that utilizes proxy servers. These servers fetch web pages on your behalf from various IP addresses that rotate regularly. This means that the website you’re scraping won’t have access to your personal IP address.
Can I be blocked by a website for scraping it?
When you use ImportFromWeb, your IP address remains completely anonymous.
How? ImportFromWeb cleverly employs proxy servers to fetch web pages from different IP addresses that rotate regularly. This ensures that your scraping activities stay under the radar and helps minimize the risk of getting blocked by websites.
What is the difference between ImportFromWeb and ImportXML?
ImportFromWeb is a user-friendly alternative to Google Sheets’ native IMPORTXML function, offering enhanced functionality and resources for web scraping.
- Web scraping efficiency: ImportFromWeb works seamlessly with hundreds of formulas per sheet, while IMPORTXML may become slow and buggy with multiple formulas.
- Website compatibility: ImportFromWeb allows you to scrape any website, including those rendered in JavaScript, whereas IMPORTXML doesn’t support JavaScript-rendered sites.
- Single formula for multiple URLs: ImportFromWeb enables you to use one formula to extract data from multiple URLs, while with IMPORTXML, you need a separate formula for each URL.
- CSS Selectors and xPath Queries: ImportFromWeb supports both CSS selectors and xPath queries, providing flexibility in data extraction. IMPORTXML only accepts xPath queries.
- Ready-made solutions: ImportFromWeb offers ready-made solutions for popular websites like Amazon, Google, and Yahoo Finance, simplifying the scraping process. IMPORTXML requires users to find the xPath queries themselves.
- Automation and monitoring: ImportFromWeb allows you to schedule updates and monitor data automatically, ensuring timely and accurate scraping results.
Overall, ImportFromWeb provides a more powerful and user-friendly experience for web scraping, making it an excellent alternative to IMPORTXML.
Is there a way to extract emails from websites?
Absolutely! With ImportFromWeb, you can extract emails from websites using our list of generic selectors, which includes emails.
Here’s the formula you have to write: =IMPORTFROMWEB("https://www.example.com", "emails")
Just bear in mind that while these selectors are designed to work with a wide range of websites, the success of email extraction may vary depending on whether the websites follow general guidelines for design and data presentation.
Another way of getting emails is to use the IMPORTFROMGOOGLE function to perform a boolean search and extract the emails: =IMPORTFROMGOOGLE("search_terms","emails")
