You will get any data from a website


Project details
You will get the complete dataset you need, along with the script I used to scrape it. With my experience in data science, I understand what good data looks like, and I care about helping clients get the data they need to support their business.
Data Tool
Python
What's included
| Service Tiers | Starter $40 | Standard $70 | Advanced $100 |
|---|---|---|---|
| Delivery Time | 3 days | 15 days | 20 days |
| Number of Sources Mined/Scraped | 1 | 3 | |
| Number of Revisions | 2 | 4 | 5 |
Optional add-ons
You can add these on the next page.
Additional Revision
+$10
About Nizar
Web Scraping & AI Automation | LLM Integration | Cloud
100%
Job Success
Bandung, Indonesia
Over the past 2+ years I’ve worked in two roles:
- **Data Engineer (6 months):** built robust web-scraping pipelines and ETL jobs on **GCP/AWS/Azure**, delivering clean, query-ready datasets.
- **AI Engineer (current):** integrate **LLMs** into real apps (incl. Google Meet add-ons), connect models to external data via scraping/RAG, and ship production features with monitoring.
**What I do**
- **Web Scraping & ETL:** Python (Requests, **Scrapy, Selenium/Playwright, DrissionPage**), anti-bot handling, rotating proxies, headless browsers, scheduling, retries, deduping, schema design, and **Parquet/CSV/SQL** outputs.
- **Pipelines & Cloud:** **GCP (Cloud Run, GCS, BigQuery), AWS (Lambda, S3), Azure**, Docker, CI/CD, cron, and cost-aware architecture.
- **AI/LLM Engineering:** OpenAI/Gemini, embeddings, **RAG**, prompt & tool design, vector stores, evaluation, and integration with external data sources.
- **Apps & Integrations:** FastAPI/Flask backends, event streams/SSE, webapps, and other platform integrations.
- **Data Products:** analytics tables, dashboards, validation, and documentation so teams can trust and reuse the data.
**Recent wins**
- **Built a detection-resistant web scraping system** with rotating proxies, headless browsers, and detection-avoidance logic, handling millions of rows weekly with consistent uptime.
- **Developed modular AI frameworks** integrating LLMs with dynamic data sources, supporting real-time reasoning and cross-platform interaction.
- **Successfully deployed and scaled projects in cloud environments** (GCP, AWS, Azure) with optimized performance, logging, and monitoring pipelines.
**Tech I use**
Python (Pandas, **BeautifulSoup**, Scrapy, Selenium/Playwright, Requests), FastAPI, SQL, BigQuery, PostgreSQL, Docker, Git/GitHub Actions, **GCP/AWS/Azure**, OpenAI/Gemini, LangChain basics, Vector DBs (Chroma/FAISS), Parquet, JSON, Linux.
**How I work**
- Clear scope → quick proof of concept → iterate with measurable milestones.
- Production-minded: retries, logging, idempotency, tests, and cost checks.
- Timezone: **UTC+7 (Asia/Jakarta)**; flexible for overlapping hours.
Let’s discuss your dataset or AI feature and agree on a small milestone to start.
_(Upwork prefers messaging inside the platform; if you want more flexibility, please contact me at nizar.fathurohman@gmail.com.)_
Steps for completing your project
After purchasing the project, send requirements so Nizar can start the project.
Delivery time starts when Nizar receives requirements from you.
Nizar works on your project following the steps below.
Revisions may occur after the delivery date.
Explore website
In this step, I explore how the website is structured and decide which method to use to scrape the data.
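As a rough illustration of this exploration step, here is a minimal sketch using only the Python standard library. The `suggest_method` helper, its heuristic, and the two sample pages are assumptions made up for this example, not the actual script delivered with a project: it simply checks whether repeated data elements already exist in the raw HTML, which hints at whether plain requests or a headless browser is the better fit.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts opening tags so the page layout can be summarized."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1


def suggest_method(html: str) -> str:
    """Hypothetical heuristic: pick a scraping approach from raw HTML."""
    counter = TagCounter()
    counter.feed(html)
    data_tags = counter.tags["tr"] + counter.tags["li"] + counter.tags["article"]
    # Repeated data elements in the raw HTML suggest that plain HTTP
    # requests plus an HTML parser may be enough.
    if data_tags > 0:
        return "static-html"
    # Otherwise the content is probably rendered by JavaScript, so a
    # headless browser (or the site's underlying API) is worth checking.
    return "headless-browser"


# Made-up sample pages, for illustration only.
SAMPLE_STATIC = "<table><tr><td>A</td></tr><tr><td>B</td></tr></table>"
SAMPLE_DYNAMIC = "<div id='app'></div><script src='bundle.js'></script>"

print(suggest_method(SAMPLE_STATIC))
print(suggest_method(SAMPLE_DYNAMIC))
```

A real site usually needs a closer look (network requests, pagination, anti-bot measures), so this heuristic is only a starting point.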
Check data requirement
I test whether the scraper can collect the data the client requires.
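One way this check could look in practice is sketched below. The `check_requirements` function, the field names, and the sample records are hypothetical and chosen only for illustration; the idea is to count, for each field the client asked for, how many scraped records are missing it.

```python
def check_requirements(records, required_fields):
    """Return, per required field, how many records are missing it."""
    missing = {field: 0 for field in required_fields}
    for record in records:
        for field in required_fields:
            value = record.get(field)
            # Treat absent keys, None, and blank strings as missing.
            if value is None or str(value).strip() == "":
                missing[field] += 1
    return missing


# Made-up sample output from a test scrape.
sample_records = [
    {"name": "Widget", "price": "9.99", "url": "https://example.com/widget"},
    {"name": "Gadget", "price": "", "url": "https://example.com/gadget"},
]

report = check_requirements(sample_records, ["name", "price", "url"])
print(report)
```

A field with a high missing count signals that the scraper, or the chosen source, may not be able to satisfy that requirement.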


