In your =IMPORTFROMWEB( ) function, the selectors describe the content you want to import.
There are four types of selectors:
- Generic selectors
- Built-in selectors, i.e. selectors preconfigured for some specific websites
- XPaths
- CSS selectors
Generic selectors
These selectors work with most websites because they describe basic content that almost any webpage has.
Selector | Definition |
---|---|
title | the page title |
h1 | the main headings |
h2 | secondary headings |
table | the first table |
table/x | the x-th table of the page (replace x with the position of the table, e.g. table/2 for the second table) |
emails | all the email addresses contained in the page |
linkUrls | the link URLs |
linkAnchors | the link anchors (i.e. the clickable text of the link) |
metaTitle | the title from the <meta> tag |
imageSources | the image sources (URL) |
imageAlternatives | the image descriptions from the image alt attributes |
metaKeywords | the keywords in the <meta> tag |
metaTwitter id | the id of the related Twitter account |
metaDescription | the description in the <meta> tag |
Examples:
=IMPORTFROMWEB("https://website.com", "title") imports the page title of your URL.
=IMPORTFROMWEB("https://website.com", "metaTitle,metaDescription") imports both the meta title and the meta description of your URL. Separate several selectors with commas inside the same quoted string.
Built-in selectors
Additionally, NoDataNoBusiness preconfigures some websites and provides you with a list of preset selectors. For those websites you don’t need to look at the page source code; all you have to do is pick the selectors you are interested in!
Currently available platforms:
- Amazon product pages
- Google Search results pages
- Yahoo Finance
- and many more to come!
Feel free to reach out to us to request custom selectors for the website you want to scrape data from.
Work with XPaths
XPath (XML Path Language) is a query language for selecting nodes from an XML document. XPath expressions can be used to describe the location of any element on a webpage.
Because XPaths are usually long strings of element names and conditions, they can look a bit scary.
However, with the help of your web browser, it can be very easy to find the right XPath for the piece of data you are looking for.
Find out more in this article:
Find an xPath with limited HTML knowledge
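To get a feel for what an XPath actually selects, here is a minimal sketch in Python, run outside of Google Sheets. It relies on the standard library module xml.etree.ElementTree, which supports only a subset of XPath (hence the leading dot in each expression). The page snippet and variable names are invented for illustration, and ImportFromWeb's own XPath engine may behave differently.

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed page invented for this illustration.
PAGE = """
<html>
  <body>
    <h1>Product list</h1>
    <div class="product">
      <a href="/items/1">First item</a>
      <span class="price">10.00</span>
    </div>
    <div class="product">
      <a href="/items/2">Second item</a>
      <span class="price">12.50</span>
    </div>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# .//h1 : every <h1> element anywhere below the root.
headings = [h.text for h in root.findall(".//h1")]

# .//div[@class='product']/a : links that are direct children of a product block.
links = root.findall(".//div[@class='product']/a")
link_texts = [a.text for a in links]
link_urls = [a.get("href") for a in links]

# .//div[2]/span : the <span> inside the second <div> of its parent.
second_price = root.find(".//div[2]/span").text

print(headings)      # ['Product list']
print(link_texts)    # ['First item', 'Second item']
print(link_urls)     # ['/items/1', '/items/2']
print(second_price)  # 12.50
```

Each expression reads from left to right as a path through the page: which elements to look at, which condition they must satisfy, and which child to keep.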
Work with CSS Selectors
CSS selectors are less powerful than XPaths for locating complex elements in HTML pages, but they usually do the job and are easier to build.
Select values in attributes
Standard CSS selectors cannot return values contained in attributes. With ImportFromWeb, this is possible by adding /attributeName just after the element.
For example, div a img/src should return the source of the image.
Select an occurrence
The CSS notation :nth-of-type(), which selects a specific occurrence of an element, can be written as /n.
Example: table/2 (equivalent to table:nth-of-type(2))
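The two shorthands above can be summed up in a short Python sketch that expands them into standard notation. The helper expand_shorthand and the ParsedSelector type are hypothetical names made up for this illustration. They mirror the notation described in this article, not ImportFromWeb's real parser, and they only handle a single suffix per selector.

```python
from typing import NamedTuple, Optional


class ParsedSelector(NamedTuple):
    css: str                  # standard CSS selector
    attribute: Optional[str]  # attribute to read, or None for the element content


def expand_shorthand(selector: str) -> ParsedSelector:
    """Expand a trailing /n or /attributeName into standard notation.

    Hypothetical helper for illustration only.
    """
    base, sep, suffix = selector.rpartition("/")
    if not sep:
        # No slash: a plain CSS selector, nothing to expand.
        return ParsedSelector(selector.strip(), None)
    base, suffix = base.strip(), suffix.strip()
    if suffix.isdigit():
        # "table/2" -> "table:nth-of-type(2)"
        return ParsedSelector(f"{base}:nth-of-type({suffix})", None)
    # "div a img/src" -> selector "div a img", attribute "src"
    return ParsedSelector(base, suffix)


print(expand_shorthand("div a img/src"))
print(expand_shorthand("table/2"))
print(expand_shorthand("h2"))
```

A numeric suffix is read as an occurrence and anything else as an attribute name, which is why table/2 and img/src can share the same slash notation.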
Find out more in this article:
Introduction to CSS selectors