PredictLeads provides company intelligence data on 75M+ companies globally.
All PredictLeads data is proprietary and sourced directly from company websites.
All datasets are structured and point-in-time.
Please check the Guide section first for more detailed info on relevant datasets.
The data can be delivered via Flat Files, APIs or Webhooks.
When using:
Flat Files
Please review the Data Model section of the relevant dataset.
The PredictLeads team can provide sample files in
JSONL (JSON Lines) format for each dataset.
APIs
Please review the
API Endpoints section.
Webhooks
Please review the
API Webhooks section.
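As a rough illustration of consuming JSONL flat files, the sketch below parses one JSON object per line. The sample records and their field names (`id`, `title`) are invented for this example and are not the actual PredictLeads data model.

```python
import io
import json
from typing import Iterator, TextIO


def read_jsonl(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines between records
            yield json.loads(line)


# Hypothetical sample content; a real file would be opened with
# open(path, encoding="utf-8") and passed to read_jsonl in the same way.
sample = io.StringIO(
    '{"id": "1", "title": "Data Engineer"}\n'
    "\n"
    '{"id": "2", "title": "Analyst"}\n'
)
records = list(read_jsonl(sample))
```

Reading line by line keeps memory use flat, which matters for large exports.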
Job Openings Dataset
Information about Job Openings sourced directly from company websites for highest accuracy and freshness.
Technology Detections Dataset
Information about approximately 800 million detections of companies that use various technologies.
Technologies Dataset
Information about approximately 25,000 technologies we track.
News Events Dataset
Structured and categorized News Events from some 18 million different blogs, PR sites and News sites.
Financing Events Dataset
Financing data on companies, most often startup funding rounds, extracted from
News Events data.
Connections Dataset
Categorized Connections between companies and, where available, a summarized context of how the two
companies work together.
Website Evolution Dataset
Information on how websites are structured, what kind of subpages they have, categories of subpages
(e.g. blog, careers, about, terms...) and cleaned text content from these subpages.
Github Repositories Dataset
Information about public GitHub repositories of companies.
Startup Platform Posts Dataset
Information about posts on popular startup platforms.
Companies Dataset includes data such as Company Name, Meta Title, Meta Description, Structured Location Data, Ticker, Parent Company, Language and other basic company information.
Schema (partial): named properties include parent_company (object, optional), redirects_to (object, optional) and meta (object, optional).
PredictLeads has historical jobs data since 2018 that includes over 180 million records, and is available
for 1.6 million companies, which includes an average of 7 million active jobs at any given time.
Job Openings are sourced directly from company websites, including their career subpages and ATS
integrations. All Jobs are categorized using industry-standard
O*NET codes.
The Job Openings Dataset includes fields such as Job Opening Title, Job Opening URL, First Seen At,
Last Seen At, Location, Category, Seniority, Description, Salary, Contract Type, O*NET Job Category Codes
and other Job Opening information.
Schema (partial): named properties include meta (object, optional).
The Technology Detections Dataset provides insight into the tech stack of a given company.
NOTE: If you're searching for general information about technologies please check the
Technologies Dataset
section.
Since 2018, PredictLeads has detected approximately 800 million technology adoptions for about
50 million companies.
The Technology Detections Dataset includes fields such as Technology Name, Technology ID, Ticker,
First Seen At, Last Seen At, Seen on Subpages Count and other Technology Detection information.
Schema (partial): named properties include meta (object, optional).
The Technologies Dataset consists of some 25,000 technologies we track.
NOTE: If you're searching for technologies a given company is using please check the
Technology Detections Dataset
section.
Technologies are collected from script tags, IP ranges, cookies, etc., and also from job descriptions where companies list them as
required skills. Subpages checked for technology data fall into the following categories: main, login, support, trust, retail, blog, news, press and news outlet.
The Technologies dataset includes fields such as Technology Name, Technology ID, Description,
Category, Pricing Data and other Technology information.
Schema (partial): named properties include meta (object, optional).
Since 2016, PredictLeads has detected over 7 million relevant news signals, which are available for
2 million companies globally.
The News Events are sourced from some 20 million blogs, news and PR sites and distilled into relevant
categories by machine learning algorithms.
The News Events dataset includes fields such as Formatted Signal, Signal Category,
Most Relevant Source URL, Article Sentence, Article Body, Article Author, Found At, Effective Date and
other News Event information.
Property in schema:
category
Category | Group name | Description |
---|---|---|
acquires | acquisition | Company acquired another company. |
merges_with | acquisition | Company merges with another company. |
sells_assets_to | acquisition | Company sells assets (like properties or warehouses) to other company. |
signs_new_client | contract | Company signs new client. |
files_suit_against | corporate_challenges | Company files suit against other company. |
has_issues_with | corporate_challenges | Company has vulnerability problems. |
closes_offices_in | cost_cutting | Company closes existing offices. |
decreases_headcount_by | cost_cutting | Company lays off employees. |
attends_event | expansion | Company attends an event. |
expands_facilities | expansion | Company opens new or expands existing facilities like warehouses, data centers, manufacturing plants etc. |
expands_offices_in | expansion | Company expands existing offices. |
expands_offices_to | expansion | Company opens new offices in another town, state, country or continent. |
increases_headcount_by | expansion | Company offers new job vacancies. |
opens_new_location | expansion | Company opens new service location like hotels, restaurants, bars, hospitals etc. |
goes_public | investment | Company issues shares to the public for the first time. |
invests_into | investment | Company invests into other company. |
invests_into_assets | investment | Company buys assets (like properties or warehouses) from other company. |
receives_financing | investment | Company receives financing like venture funding, loan, grant etc. |
hires | leadership | Company hired new executive or senior personnel. |
leaves | leadership | Executive or senior personnel left the company. |
promotes | leadership | Company promoted existing executive or senior personnel. |
retires_from | leadership | Executive or senior personnel retire from the company. |
integrates_with | new_offering | Company integrates with other company. |
is_developing | new_offering | Company is developing a new offering. |
launches | new_offering | Company launches new offering. |
partners_with | partnership | Company partners with other company. |
receives_award | recognition | Company or person at the company receives an award. |
recognized_as | recognition | Company or person at the company receives recognition. |
identified_as_competitor_of | relational | New or existing competitor was identified. |
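Because every category belongs to exactly one group, a lookup table can help when filtering or aggregating events on the client side. A minimal sketch: the mapping is transcribed from the category table, while the `events` input and its dictionary shape (a `category` key) are hypothetical and only serve the example.

```python
from collections import Counter

# Category -> group name, transcribed from the category table.
CATEGORY_GROUP = {
    "acquires": "acquisition",
    "merges_with": "acquisition",
    "sells_assets_to": "acquisition",
    "signs_new_client": "contract",
    "files_suit_against": "corporate_challenges",
    "has_issues_with": "corporate_challenges",
    "closes_offices_in": "cost_cutting",
    "decreases_headcount_by": "cost_cutting",
    "attends_event": "expansion",
    "expands_facilities": "expansion",
    "expands_offices_in": "expansion",
    "expands_offices_to": "expansion",
    "increases_headcount_by": "expansion",
    "opens_new_location": "expansion",
    "goes_public": "investment",
    "invests_into": "investment",
    "invests_into_assets": "investment",
    "receives_financing": "investment",
    "hires": "leadership",
    "leaves": "leadership",
    "promotes": "leadership",
    "retires_from": "leadership",
    "integrates_with": "new_offering",
    "is_developing": "new_offering",
    "launches": "new_offering",
    "partners_with": "partnership",
    "receives_award": "recognition",
    "recognized_as": "recognition",
    "identified_as_competitor_of": "relational",
}


def categories_in_group(group):
    """Return the sorted category names that belong to one group."""
    return sorted(c for c, g in CATEGORY_GROUP.items() if g == group)


def count_by_group(events):
    """Count events per group; events are dicts with a 'category' key (assumed shape)."""
    return Counter(CATEGORY_GROUP.get(e["category"], "unknown") for e in events)


# Hypothetical input for illustration only.
events = [{"category": "hires"}, {"category": "leaves"}, {"category": "launches"}]
summary = count_by_group(events)
```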
Schema (partial): named properties include company1 (object, optional), company2 (object, optional) and meta (object, optional).
Deleted News Events.
Schema (partial): named properties include meta (object, optional).
The Financing Events Dataset is the funding category from News Events, extended into its own standalone
dataset that includes information about Venture Capital rounds and Private Equity investments.
The Financing Events Dataset includes fields such as Financing Type, Investors, Investment Amount,
Investment Date, Source URL, and other Financing Event information.
Schema (partial): named properties include meta (object, optional).
Since 2019, PredictLeads has detected over 180 million business connections between companies globally.
The Connections Dataset describes relationships between some 40 million companies.
Connections are sourced from Case Study pages, Testimonials and Our Customers sections, and are detected through
image recognition of company logos.
The Connections Dataset includes fields such as Connection Category, Source Category, Source URL, Context,
First Seen At, Last Seen At, and other Connection information.
Property in schema:
category
We keep the direction of the category consistent, so that the results can be read like a sentence:
Company1 is a {category} to Company2.
The special case is the other category, due to its non-specificity.
Category | Description |
---|---|
partner | Read as: Company1 is a partner of Company2. A partner relationship signals a collaboration between companies that goes in both directions. You will find such connections under titles such as "We work with", "Our partners", "The company we keep", and other titles similar in meaning. The implications of this category can be quite wide, but in general all partnerships are a positive signal. NOTE: Until February 2024, the partner category did not always signal a true partnership between two companies: it also covered relationships that were closer than those in the other category but not always as strong as the partner keyword suggests. The meaning is now applied correctly to all new cases and to cases that have been last seen since then and still have the partner category. |
vendor | Read as: Company1 is a vendor to Company2. The vendor category is much simpler in its meaning. It simply means that Company1 is a supplier for Company2 in some way. All vendor relationships are positive, although some are more positive than others. On a webpage, such connections can be seen in lists such as "Our customers", "Trusted by", "Enabling businesses to", etc. For example, to be a vendor to Microsoft, the company has to pass certain requirements as to company size, product reliability, safety, etc. This can be seen as a vetting process of some sort and increases the company's trustworthiness. The vendor category can also be used to determine the supply chain of a certain company and evaluate supply chain risks. |
integration | Read as: Company1 has an integration with Company2. Most of the time an integration happens between a platform and a service. |
investor | Read as: Company1 is an investor in Company2. The investor category identifies relationships between companies making investments (Company1) and companies receiving investments (Company2). The more investors the company has, the better. One could also track what companies/sectors competitors are investing in. |
parent | Read as: Company1 is a parent company of Company2. The parent category defines a parent-subsidiary relationship between websites. In some cases such websites are only localized versions for specific markets. NOTE: It often makes sense to check not only the current hierarchy level, e.g. "blizzard.de", but also its parent connections, e.g. "blizzard.com" and "activisionblizzard.com", since company connections (customers, investors…) are sometimes also available at higher levels of the company hierarchy. |
rebranding | Read as: Company1 is a rebranding of Company2. Company1 being a rebranding of Company2 suggests that Company2 has undergone a process to change its brand identity. We detect rebranding based on domain change and similarities between the old and the new website. |
published_in | Read as: Company1 is published in Company2. The published_in category is usually found on early stage startup websites with fewer accomplishments on the market, but positive press coverage. You can see such connections under titles such as "As seen in", "Featured in", "Talking about us", etc. A published_in connection is a positive signal, but after some months or years, we would expect to see them removed and replaced with other content proving the legitimacy of the company. |
other | Read as: Company1 is connected to Company2. This is the most general category. If we detected a relationship between companies, but we were unsure which category it belonged to, or it didn't belong to any of our categories, then we would categorize it as other. Outgoing other category connections are a slightly positive signal, while incoming ones are considerably more so. Thus a connection Company1 -> other -> Company2 is more positive for Company2 than for Company1, because Company2 was featured on Company1's website. Sometimes the other category can have additional information via the source_category attribute depending on where it was found. An example of such a source_category would be cookie_section, which is explained further below. Such connections could be evaluated differently. |
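The "Read as" phrasing for each category lends itself to a small template lookup. The sketch below turns a connection into a sentence using the wording from the category descriptions; the function name and the example domains are hypothetical, and unknown categories fall back to the generic other wording as a design choice of this example.

```python
# Sentence templates taken from the "Read as" phrases of each category.
READ_AS = {
    "partner": "{c1} is a partner of {c2}.",
    "vendor": "{c1} is a vendor to {c2}.",
    "integration": "{c1} has an integration with {c2}.",
    "investor": "{c1} is an investor in {c2}.",
    "parent": "{c1} is a parent company of {c2}.",
    "rebranding": "{c1} is a rebranding of {c2}.",
    "published_in": "{c1} is published in {c2}.",
    "other": "{c1} is connected to {c2}.",
}


def describe_connection(company1, category, company2):
    """Render a connection as a sentence, keeping the Company1 -> Company2 direction."""
    template = READ_AS.get(category, READ_AS["other"])  # fallback is an assumption
    return template.format(c1=company1, c2=company2)


# Hypothetical domains for illustration only.
sentence = describe_connection("vendor.example", "vendor", "customer.example")
```

Because the direction is fixed, swapping the two arguments changes the meaning, so callers should pass the companies in the order the record provides them.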
Property in schema:
source_category
The source_category property provides additional context on where and how the connection was found.
Schema (partial): named properties include meta (object, optional).
Since 2021, PredictLeads has detected some 470 million subpages on about 60 million websites,
with the goal of tracking how these websites evolve over time.
The Website Evolution dataset tracks when subpages such as "about us", "blog", "careers", "api docs",
"customer support" etc. are added over time. Many subpages added within a short time frame
can indicate that the company is growing quickly.
Schema (partial): named properties include meta (object, optional).
Since 2021, PredictLeads has detected some 330,000 public GitHub repositories. The GitHub Repositories Dataset covers about 65,000 companies and provides insight into how frequently they contribute to their public GitHub repositories.
Schema (partial): named properties include meta (object, optional).
Provides information about posts on popular startup platforms by companies that are hiring or launching new products.
Schema (partial): named properties include meta (object, optional).
Provides the current subscription's status and API usage limits.
This endpoint returns the current subscription's status, the maximum number of API requests allowed per month,
and the number of API requests made so far during the current month.
Schema (partial): named properties include meta (object, optional).
Our API is RESTful and follows the JSON API specification.
For easier implementation, we also provide the API description in
OpenAPI schema (version 3.1.0) JSON format
and as an OpenAPI SwaggerUI playground.
In order to use our API, you must first sign up
to get an authentication token.
You can see the live status of our API services.
If you would like to receive data as Flat Files or via Webhooks,
please get in touch with our sales team.
Authentication is done via the authentication token, which you can find on
Your Subscription Plans page.
To use the API, call an endpoint URL with both your API token and API key.
The primary and recommended method for authenticating with PredictLeads is to specify the API key in the
HTTP request header using an extended header field.
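A header-based request could be prepared roughly as follows. This is only a sketch: the header names `X-Api-Key` and `X-Api-Token` are assumptions not stated in this text, so confirm them against the API Endpoints section before use. The helper name and placeholder credentials are likewise hypothetical.

```python
import urllib.request

API_BASE = "https://predictleads.com/api/v3"


def build_request(path, api_key, api_token):
    """Prepare (but do not send) an authenticated GET request."""
    # ASSUMPTION: the header names below are illustrative and unverified here.
    headers = {"X-Api-Key": api_key, "X-Api-Token": api_token}
    return urllib.request.Request(f"{API_BASE}{path}", headers=headers, method="GET")


# Placeholder credentials; real values come from Your Subscription Plans page.
request = build_request("/companies/example.com", "my-key", "my-token")
# Sending the request would then be: urllib.request.urlopen(request)
```

Keeping credentials in headers rather than in the URL avoids leaking them into logs that record request URLs.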
Each account using the API is subject to request limits.
The number of requests you can make each month is limited by the chosen plan. You can track your monthly usage on
Your Subscription Plans page.
Once the account reaches the limit, all further requests return a 402 HTTP error, notifying the user that
the limit has been reached. If you have any further questions regarding the limits, feel free to contact us via
support.
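Client code may want to tell the quota case apart from other failures so it can stop retrying until the next month. A minimal sketch, where the exception class and function name are hypothetical and only the 402 status code comes from the text:

```python
class QuotaExceededError(RuntimeError):
    """Raised when the monthly request limit has been reached (HTTP 402)."""


def check_status(status_code):
    """Raise a descriptive error for failed responses; return None on success."""
    if status_code == 402:
        raise QuotaExceededError("Monthly API request limit reached")
    if status_code >= 400:
        raise RuntimeError(f"Request failed with HTTP {status_code}")
```

A caller would typically catch `QuotaExceededError` separately and pause further requests instead of retrying.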
Responses with a list of objects are paginated. Use the page parameter to get more results.
The total number of results is returned inside the meta property as count.
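Walking through all pages could look like the sketch below. It assumes that page numbering starts at 1 and that list items sit under a top-level `data` member (the JSON API convention); only the `page` parameter and `meta` `count` come from the text. `fetch_page` is a hypothetical callable that performs the HTTP request and returns the parsed JSON body.

```python
from typing import Callable, Iterator


def iter_all(fetch_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield every item across pages until meta.count items were seen."""
    page = 1  # ASSUMPTION: the first page is page 1
    seen = 0
    while True:
        body = fetch_page(page)
        items = body.get("data", [])  # ASSUMPTION: items live under "data"
        total = body.get("meta", {}).get("count", 0)
        if not items:
            break  # an empty page also ends the loop
        yield from items
        seen += len(items)
        if seen >= total:
            break
        page += 1


# Fake pages standing in for real API responses.
FAKE_PAGES = {
    1: {"data": [{"id": "a"}, {"id": "b"}], "meta": {"count": 3}},
    2: {"data": [{"id": "c"}], "meta": {"count": 3}},
}


def fake_fetch(page):
    return FAKE_PAGES.get(page, {"data": [], "meta": {"count": 3}})


ids = [item["id"] for item in iter_all(fake_fetch)]
```

Stopping on either condition (count reached or empty page) guards against an endless loop if the count changes while paging.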
Follow the company.
POST
https://predictleads.com/api/v3/companies/{domain}/follow
Path Parameters: domain (string)
Query Parameters: [not preserved]
Unfollow the company.
POST
https://predictleads.com/api/v3/companies/{domain}/unfollow
Path Parameters: domain (string)
Returns Company.
GET
https://predictleads.com/api/v3/companies/{domain}
Path Parameters: domain (string)
Returns a list of a company's Job Openings.
GET
https://predictleads.com/api/v3/companies/{domain}/job_openings
Path Parameters: domain (string)
Query Parameters (partial; other names not preserved):
active_only [optional] boolean
not_closed [optional] boolean
with_description_only [optional] boolean
with_location_only [optional] boolean
categories [optional] array of strings
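Building the Job Openings URL with the boolean filters named above (`active_only`, `not_closed`, `with_description_only`, `with_location_only`) might look like this. It is a sketch under one notable assumption: booleans are serialized as the strings `true` and `false`, which this text does not specify. The helper name is hypothetical.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://predictleads.com/api/v3"
BOOLEAN_FILTERS = {
    "active_only",
    "not_closed",
    "with_description_only",
    "with_location_only",
}


def job_openings_url(domain, **filters):
    """Build the Job Openings URL for a company domain with optional boolean filters."""
    unknown = set(filters) - BOOLEAN_FILTERS
    if unknown:
        raise ValueError(f"Unsupported filters: {sorted(unknown)}")
    # ASSUMPTION: the API accepts lowercase "true"/"false" for boolean parameters.
    query = urlencode({name: str(bool(value)).lower() for name, value in sorted(filters.items())})
    url = f"{API_BASE}/companies/{quote(domain, safe='')}/job_openings"
    return f"{url}?{query}" if query else url


url = job_openings_url("example.com", active_only=True, with_location_only=True)
```

Rejecting unknown filter names early catches typos that the API would otherwise silently ignore or reject.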
Returns a single Job Opening.
GET
https://predictleads.com/api/v3/job_openings/{id}
Path Parameters: id
Returns Technologies used by a specific Company as a list of Technology Detections.
GET
https://predictleads.com/api/v3/companies/{domain}/technology_detections
Path Parameters: domain (string)
Query Parameters: [not preserved]
Returns Companies using a specific Technology as a list of Technology Detections, ordered by date found, descending.
E.g. using this endpoint one can get a list of all companies using HubSpot or any other of the 25,000 technologies PredictLeads tracks.
The specific Technology ID can be obtained by querying the Retrieve all tracked Technologies endpoint.
GET
https://predictleads.com/api/v3/discover/technologies/{id}/technology_detections
Returns a single Technology.
GET
https://predictleads.com/api/v3/technologies/{id}
Path Parameters: id
Returns a list of a company's News Events.
GET
https://predictleads.com/api/v3/companies/{domain}/news_events
Path Parameters: domain (string)
Query Parameters (partial; other names not preserved):
categories [optional] array
Returns a specific News Event.
GET
https://predictleads.com/api/v3/news_events/{id}
Path Parameters: id
Returns a list of a company's Financing Events.
GET
https://predictleads.com/api/v3/companies/{domain}/financing_events
Path Parameters: domain (string)
Query Parameters: [not preserved]
Returns a list of a company's Connections.
GET
https://predictleads.com/api/v3/companies/{domain}/connections
Path Parameters: domain (string)
Query Parameters (partial; other names not preserved):
categories [optional] array
Returns a list of a company's Website Evolution entries.
GET
https://predictleads.com/api/v3/companies/{domain}/website_evolution
Path Parameters: domain (string)
Query Parameters: [not preserved]
Returns a list of a company's GitHub Repositories.
GET
https://predictleads.com/api/v3/companies/{domain}/github_repositories
Path Parameters: domain (string)
Query Parameters: [not preserved]
Returns a list of latest posts on popular startup platforms by companies that are hiring or launching new products.
GET
https://predictleads.com/api/v3/discover/startup_platform_posts
Query Parameters (partial; other names not preserved):
post_types [optional] array
Provides the current subscription's status and usage limits.
GET
https://predictleads.com/api/v3/api_subscription
Whenever we find new data for one of the companies you are tracking, we will send a POST request to
your general Webhook URL with the object.
Go to Your Subscription Plans page to enter
your Webhook URL.
To validate whether the webhook came from PredictLeads, we suggest verifying the webhook payloads with the
X-Predict-Signature header (which we pass with every webhook).
The header value is a SHA1 HMAC hexdigest computed with your token and the raw body of the request.
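Verification on the receiving side could be sketched as below, using Python's standard `hmac` and `hashlib` modules. It assumes the token is used as the HMAC key in its UTF-8 encoding and that the header carries the bare hex digest without any prefix; the function name and sample values are hypothetical.

```python
import hashlib
import hmac


def is_valid_signature(token, raw_body, header_value):
    """Check an X-Predict-Signature value against the raw request body."""
    # ASSUMPTION: the token, UTF-8 encoded, is the HMAC key.
    expected = hmac.new(token.encode("utf-8"), raw_body, hashlib.sha1).hexdigest()
    received = header_value.strip().lower()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


# Illustration with made-up values: sign a body, then verify it.
token = "my-token"
body = b'{"data": []}'
signature = hmac.new(token.encode("utf-8"), body, hashlib.sha1).hexdigest()
ok = is_valid_signature(token, body, signature)
```

The digest must be computed over the raw bytes exactly as received; re-serializing parsed JSON before hashing would usually change the bytes and make the check fail.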
Sends a list of a company's created or updated News Events.
POST
{webhook_url}
Sends a list of a company's deleted News Events.
DELETE
{webhook_deleted_url}