Commit 6ad496d

initial commit
0 parents  commit 6ad496d

14 files changed

+7884
-0
lines changed

PKG-INFO

Lines changed: 209 additions & 0 deletions
@@ -0,0 +1,209 @@
1+
Metadata-Version: 2.1
2+
Name: proxycurl-py
3+
Version: 0.1.0.post2
4+
Summary:
5+
Author: Nubela
6+
Author-email: [email protected]
7+
Requires-Python: >=3.7,<4.0
8+
Classifier: Programming Language :: Python :: 3
9+
Classifier: Programming Language :: Python :: 3.7
10+
Classifier: Programming Language :: Python :: 3.8
11+
Classifier: Programming Language :: Python :: 3.9
12+
Classifier: Programming Language :: Python :: 3.10
13+
Classifier: Programming Language :: Python :: 3.11
14+
Classifier: Programming Language :: Python :: 3.12
15+
Provides-Extra: asyncio
16+
Provides-Extra: gevent
17+
Provides-Extra: twisted
18+
Requires-Dist: Twisted (>=21.7.0,<22.0.0) ; extra == "twisted"
19+
Requires-Dist: aiohttp (>=3.7.4,<4.0.0) ; extra == "asyncio"
20+
Requires-Dist: gevent (>=22.10.2,<23.0.0) ; extra == "gevent"
21+
Requires-Dist: requests (>=2.25.0,<3.0.0) ; extra == "gevent"
22+
Requires-Dist: treq (>=21.5.0,<22.0.0) ; extra == "twisted"
23+
Description-Content-Type: text/markdown
24+
25+
# `proxycurl-py` - The official Python client for Proxycurl API to scrape and enrich LinkedIn profiles
26+
27+
* [What is Proxycurl?](#what-is-proxycurl-)
28+
* [Before you install](#before-you-install)
29+
* [Installation and supported Python versions](#installation-and-supported-python-versions)
30+
* [Initializing `proxycurl-py` with an API Key](#initializing--proxycurl-py--with-an-api-key)
31+
* [Usage with examples](#usage-with-examples)
32+
+ [Enrich a Person Profile](#enrich-a-person-profile)
33+
+ [Enrich a Company Profile](#enrich-a-company-profile)
34+
+ [Lookup a person](#lookup-a-person)
35+
+ [Lookup a company](#lookup-a-company)
36+
+ [Lookup a LinkedIn Profile URL from a work email address](#lookup-a-linkedin-profile-url-from-a-work-email-address)
37+
+ [Enrich LinkedIn member profiles in bulk (from a CSV)](#enrich-linkedin-member-profiles-in-bulk--from-a-csv-)
38+
+ [More *asyncio* examples](#more--asyncio--examples)
39+
* [Rate limit and error handling](#rate-limit-and-error-handling)
40+
* [API Endpoints and their corresponding documentation](#api-endpoints-and-their-corresponding-documentation)
41+
42+
## What is Proxycurl?
43+
44+
**Proxycurl** is an enrichment API to fetch fresh data on people and businesses. We are a fully-managed API that sits between your application and raw data so that you can focus on building the application instead of worrying about building a web-scraping team and processing data at scale.
45+
46+
With Proxycurl, you can programmatically:
47+
48+
- Enrich profiles on people and companies
49+
- Lookup people and companies
50+
- Lookup contact information on people and companies
51+
- Check if an email address is of a disposable nature
52+
- [And more..](https://nubela.co/proxycurl/docs#explain-it-to-me-like-i-39-m-5)
53+
54+
Visit [Proxycurl's website](https://nubela.co/proxycurl) for more details.
55+
56+
## Before you install
57+
58+
You should understand that `proxycurl-py` was designed with concurrency as a first-class citizen from the ground up. To install `proxycurl-py`, *you have to pick a concurrency model*.
59+
60+
We support the following concurrency models:
61+
62+
* [asyncio](https://docs.python.org/3/library/asyncio.html) - See implementation example [here](https://github.com/nubelaco/proxycurl-linkedin-scraper/blob/main/examples/lib-asyncio.py).
63+
* [gevent](https://www.gevent.org/) - See implementation example [here](https://github.com/nubelaco/proxycurl-linkedin-scraper/blob/main/examples/lib-gevent.py).
64+
* [twisted](https://twisted.org/) - See implementation example [here](https://github.com/nubelaco/proxycurl-linkedin-scraper/blob/main/examples/lib-twisted.py).
65+
66+
The right way to use the Proxycurl API is to make API calls concurrently. In fact, making API requests concurrently is the only way to achieve a high rate of throughput. On the default rate limit, you can enrich up to 432,000 profiles per day. See [this blog post](https://nubela.co/blog/how-to-maximize-throughput-on-proxycurl/) for context.
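
To make the throughput argument concrete, here is a minimal sketch of bounded concurrency with *asyncio*. It does not call Proxycurl: `fake_enrich` is a hypothetical stand-in for an API call, and the `max_concurrency` value is an arbitrary assumption chosen for illustration.

```python
import asyncio
import time


async def fake_enrich(url):
    # Hypothetical stand-in for a network call; sleeps instead of hitting an API.
    await asyncio.sleep(0.05)
    return {"url": url}


async def enrich_all(urls, max_concurrency=10):
    # The semaphore caps how many calls are in flight at the same time.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(url):
        async with semaphore:
            return await fake_enrich(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(u) for u in urls))


urls = [f"https://www.linkedin.com/in/example-{i}/" for i in range(20)]
start = time.perf_counter()
profiles = asyncio.run(enrich_all(urls))
elapsed = time.perf_counter() - start
print(f"{len(profiles)} profiles in {elapsed:.2f}s")
```

Run one after another, the 20 simulated calls would take about one second; with ten in flight at a time they finish in roughly two waves. For scale, 432,000 profiles per day works out to 300 per minute.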
67+
68+
## Installation and supported Python versions
69+
70+
`proxycurl-py` is [available on PyPI](https://pypi.org/project/proxycurl-py/). You can install it into your project with one of the following commands:
71+
72+
```bash
# install proxycurl-py with asyncio
$ pip install 'proxycurl-py[asyncio]'

# install proxycurl-py with gevent
$ pip install 'proxycurl-py[gevent]'

# install proxycurl-py with twisted
$ pip install 'proxycurl-py[twisted]'
```
82+
83+
`proxycurl-py` is tested on Python `3.7`, `3.8` and `3.9`.
84+
85+
## Initializing `proxycurl-py` with an API Key
86+
87+
You can get an API key by [registering an account](https://nubela.co/proxycurl/auth/register) with Proxycurl. The API Key can be retrieved from the dashboard.
88+
89+
To use Proxycurl with the API Key:
90+
91+
* You can run your script with the `PROXYCURL_API_KEY` environment variable set.
92+
* Or, you can prepend your script with the API key injected into the environment. See `proxycurl/config.py` for an example.
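
The second option can be sketched as follows. This is only an illustration: the placeholder key and the `ensure_api_key` helper are hypothetical, and it assumes the library reads `PROXYCURL_API_KEY` from the environment when it is imported, so the variable is set first.

```python
import os


def ensure_api_key(fallback):
    """Set PROXYCURL_API_KEY if it is not already set; return the value in effect."""
    # setdefault() leaves an existing value untouched, so a key exported
    # in the shell still takes precedence over the fallback.
    return os.environ.setdefault("PROXYCURL_API_KEY", fallback)


api_key = ensure_api_key("replace-with-your-api-key")

# Import the client only after the variable is in place (assumption: the
# library picks the key up from the environment at import time).
# from proxycurl.asyncio import Proxycurl
```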
93+
94+
## Usage with examples
95+
96+
I will be using `proxycurl-py` with the *asyncio* concurrency model to illustrate some examples of what you can do with Proxycurl and how the code will look with this library.
97+
98+
For examples with other concurrency models:
99+
100+
* *gevent*, see `examples/lib-gevent.py`.
101+
* *twisted*, see `examples/lib-twisted.py`.
102+
103+
### Enrich a Person Profile
104+
105+
Given a *LinkedIn Member Profile URL*, you can get the entire profile back in structured data with Proxycurl's [Person Profile API Endpoint](https://nubela.co/proxycurl/docs#people-api-person-profile-endpoint).
106+
107+
```python
from proxycurl.asyncio import Proxycurl, do_bulk
import asyncio
import csv

proxycurl = Proxycurl()
person = asyncio.run(proxycurl.linkedin.person.get(
    url='https://www.linkedin.com/in/williamhgates/'
))
print('Person Result:', person)
```
118+
119+
### Enrich a Company Profile
120+
121+
Given a *LinkedIn Company Profile URL*, enrich the URL with its full profile using Proxycurl's [Company Profile API Endpoint](https://nubela.co/proxycurl/docs#company-api-company-profile-endpoint).
122+
123+
```python
# Reuses `proxycurl` and `asyncio` from the first example
company = asyncio.run(proxycurl.linkedin.company.get(
    url='https://www.linkedin.com/company/tesla-motors'
))
print('Company Result:', company)
```
129+
130+
### Lookup a person
131+
132+
Given a first name and a company name or domain, lookup a person with Proxycurl's [Person Lookup API Endpoint](https://nubela.co/proxycurl/docs#people-api-person-lookup-endpoint).
133+
134+
```python
lookup_results = asyncio.run(proxycurl.linkedin.person.resolve(
    first_name="bill", last_name="gates", company_domain="microsoft"
))
print('Person Lookup Result:', lookup_results)
```
138+
139+
### Lookup a company
140+
141+
Given a company name or a domain, lookup a company with Proxycurl's [Company Lookup API Endpoint](https://nubela.co/proxycurl/docs#company-api-company-lookup-endpoint).
142+
143+
```python
company_lookup_results = asyncio.run(proxycurl.linkedin.company.resolve(
    company_name="microsoft", company_domain="microsoft.com"
))
print('Company Lookup Result:', company_lookup_results)
```
147+
148+
### Lookup a LinkedIn Profile URL from a work email address
149+
150+
Given a work email address, lookup a LinkedIn Profile URL with Proxycurl's [Reverse Work Email Lookup Endpoint](https://nubela.co/proxycurl/docs#contact-api-reverse-work-email-lookup-endpoint).
151+
152+
```python
lookup_results = asyncio.run(proxycurl.linkedin.person.resolve_by_email(
    work_email="[email protected]"
))
print('Reverse Work Email Lookup Result:', lookup_results)
```
156+
157+
### Enrich LinkedIn member profiles in bulk (from a CSV)
158+
159+
Given a CSV file with a list of LinkedIn member profile URLs, you can enrich the list in the following manner:
160+
161+
```python
# PROCESS BULK WITH CSV
bulk_linkedin_person_data = []
with open('sample.csv', 'r') as file:
    reader = csv.reader(file)
    next(reader, None)  # skip the header row
    for row in reader:
        # Each entry pairs the function to call with its keyword arguments
        bulk_linkedin_person_data.append(
            (proxycurl.linkedin.person.get, {'url': row[0]})
        )
results = asyncio.run(do_bulk(bulk_linkedin_person_data))

print('Bulk:', results)
```
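
The bulk example skips the first row and reads the URL from the first column, so the CSV presumably has one header row followed by one profile URL per line. The sketch below writes and re-reads such a file using only the standard library. The column name `linkedin_profile_url`, the helper names, and the sample URLs are hypothetical.

```python
import csv
import os
import tempfile


def write_sample_csv(path, urls):
    """Write a header row followed by one profile URL per row."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["linkedin_profile_url"])  # hypothetical column name
        writer.writerows([u] for u in urls)


def read_profile_urls(path):
    """Mirror the bulk example: skip the header, take the first column."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)
        return [row[0] for row in reader if row]


sample_urls = [
    "https://www.linkedin.com/in/williamhgates/",
    "https://www.linkedin.com/in/example-profile/",
]
csv_path = os.path.join(tempfile.mkdtemp(), "sample.csv")
write_sample_csv(csv_path, sample_urls)
loaded_urls = read_profile_urls(csv_path)
print(loaded_urls)
```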
175+
176+
### More *asyncio* examples
177+
178+
More *asyncio* examples can be found at `examples/lib-asyncio.py`.
179+
180+
## Rate limit and error handling
181+
182+
There is no need for you to handle rate limits (`429` HTTP status error). The [library handles rate limits automatically with exponential backoff](https://github.com/nubelaco/proxycurl-linkedin-scraper/blob/main/proxycurl/asyncio/base.py#L109).
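
For readers unfamiliar with the technique, exponential backoff can be sketched as below. This is not the library's implementation: `RateLimited`, `call_with_backoff`, and the delay values are hypothetical names and numbers used only to show the idea of doubling the wait after each rate-limited attempt.

```python
import asyncio


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 response."""


async def call_with_backoff(make_call, max_retries=5, base_delay=0.01):
    """Retry make_call() on RateLimited, doubling the wait after each failure."""
    for attempt in range(max_retries + 1):
        try:
            return await make_call()
        except RateLimited:
            if attempt == max_retries:
                raise  # out of retries, let the caller see the error
            await asyncio.sleep(base_delay * 2 ** attempt)


state = {"attempts": 0}


async def flaky_call():
    # Simulates an endpoint that is rate limited twice, then succeeds.
    state["attempts"] += 1
    if state["attempts"] < 3:
        raise RateLimited()
    return "ok"


result = asyncio.run(call_with_backoff(flaky_call))
print(result, "after", state["attempts"], "attempts")
```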
183+
184+
However, you do need to handle other error codes. Errors will be returned in the form of `ProxycurlException`. The [list of possible errors](https://nubela.co/proxycurl/docs#overview-errors) is available in our API documentation.
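
A minimal sketch of handling such errors around a single call, assuming the exception is raised by the awaited call. To keep it runnable without the package, `ProxycurlException` is redefined locally as a stand-in and `get_person` is a hypothetical stub for `proxycurl.linkedin.person.get`; in real code you would import the exception from the library instead.

```python
import asyncio


class ProxycurlException(Exception):
    """Local stand-in so the sketch runs without the package installed."""


async def get_person(url):
    # Hypothetical stub in place of proxycurl.linkedin.person.get(url=url).
    if url.endswith("/missing/"):
        raise ProxycurlException("404 Not Found")
    return {"url": url}


async def get_person_or_none(url):
    """Return the profile, or None when the call fails with a Proxycurl error."""
    try:
        return await get_person(url)
    except ProxycurlException as exc:
        print(f"Skipping {url}: {exc}")
        return None


found = asyncio.run(get_person_or_none("https://www.linkedin.com/in/williamhgates/"))
missing = asyncio.run(get_person_or_none("https://www.linkedin.com/in/missing/"))
```

Whether to skip, retry, or abort on a given error code depends on your application; the sketch only shows where the handling goes.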
185+
186+
## API Endpoints and their corresponding documentation
187+
188+
Here we list the possible API endpoints and their corresponding library functions. Refer to each endpoint's API documentation to find out the required arguments that need to be passed to the function.
189+
190+
| Function | Endpoint | API |
191+
| ---------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------- |
192+
| `linkedin.company.employee_count(**kwargs)` | [Employee Count Endpoint](https://nubela.co/proxycurl/docs#company-api-employee-count-endpoint) | [Company API](https://nubela.co/proxycurl/docs#company-api) |
193+
| `linkedin.company.resolve(**kwargs)`           | [Company Lookup Endpoint](https://nubela.co/proxycurl/docs#company-api-company-lookup-endpoint)                                 | [Company API](https://nubela.co/proxycurl/docs#company-api) |
194+
| `linkedin.company.employee_list(**kwargs)` | [Employee Listing Endpoint](https://nubela.co/proxycurl/docs#company-api-employee-listing-endpoint) | [Company API](https://nubela.co/proxycurl/docs#company-api) |
195+
| `linkedin.company.get(**kwargs)` | [Company Profile Endpoint](https://nubela.co/proxycurl/docs#company-api-company-profile-endpoint) | [Company API](https://nubela.co/proxycurl/docs#company-api) |
196+
| `linkedin.person.resolve_by_email(**kwargs)` | [Reverse Work Email Lookup Endpoint](https://nubela.co/proxycurl/docs#contact-api-reverse-work-email-lookup-endpoint) | [Contact API](https://nubela.co/proxycurl/docs#contact-api) |
197+
| `linkedin.person.lookup_email(**kwargs)` | [Work Email Lookup Endpoint](https://nubela.co/proxycurl/docs#contact-api-work-email-lookup-endpoint) | [Contact API](https://nubela.co/proxycurl/docs#contact-api) |
198+
| `linkedin.person.personal_contact(**kwargs)` | [Personal Contact Number Lookup Endpoint](https://nubela.co/proxycurl/docs#contact-api-personal-contact-number-lookup-endpoint) | [Contact API](https://nubela.co/proxycurl/docs#contact-api) |
199+
| `linkedin.person.personal_email(**kwargs)` | [Personal Email Lookup Endpoint](https://nubela.co/proxycurl/docs#contact-api-personal-email-lookup-endpoint) | [Contact API](https://nubela.co/proxycurl/docs#contact-api) |
200+
| `linkedin.disposable_email(**kwargs)` | [Disposable Email Address Check Endpoint](https://nubela.co/proxycurl/docs#contact-api-disposable-email-address-check-endpoint) | [Contact API](https://nubela.co/proxycurl/docs#contact-api) |
201+
| `linkedin.company.find_job(**kwargs)` | [Job Listings Endpoint](https://nubela.co/proxycurl/docs#jobs-api-jobs-listing-endpoint) | [Jobs API](https://nubela.co/proxycurl/docs#jobs-api) |
202+
| `linkedin.job.get(**kwargs)`                   | [Job Profile Endpoint](https://nubela.co/proxycurl/docs#jobs-api-job-profile-endpoint)                                           | [Jobs API](https://nubela.co/proxycurl/docs#jobs-api)       |
203+
| `linkedin.person.resolve(**kwargs)` | [Person Lookup Endpoint](https://nubela.co/proxycurl/docs#people-api-person-lookup-endpoint) | [People API](https://nubela.co/proxycurl/docs#people-api) |
204+
| `linkedin.company.role_lookup(**kwargs)` | [Role Lookup Endpoint](https://nubela.co/proxycurl/docs#people-api-role-lookup-endpoint) | [People API](https://nubela.co/proxycurl/docs#people-api) |
205+
| `linkedin.person.get(**kwargs)` | [Person Profile Endpoint](https://nubela.co/proxycurl/docs#people-api-person-profile-endpoint) | [People API](https://nubela.co/proxycurl/docs#people-api) |
206+
| `linkedin.school.get(**kwargs)` | [School Profile Endpoint](https://nubela.co/proxycurl/docs#school-api-school-profile-endpoint) | [School API](https://nubela.co/proxycurl/docs#school-api) |
207+
| `linkedin.company.reveal` | [Reveal Endpoint](https://nubela.co/proxycurl/docs#reveal-api-reveal-endpoint) | [Reveal API](https://nubela.co/proxycurl/docs#reveal-api) |
208+
| `get_balance(**kwargs)` | [View Credit Balance Endpoint](https://nubela.co/proxycurl/docs#meta-api-view-credit-balance-endpoint) | [Meta API](https://nubela.co/proxycurl/docs#meta-api) |
209+
