Component Data Engineer — Datasheet & Teardown Data Extraction for Electronics LCA
Worldwide
We build automated Life Cycle Assessment (LCA) tools for electronics supply chains. We need a Component Data Engineer to build the component ground-truth corpus that feeds our model training and validation. The contractor extracts and structures component-level data from electronics datasheets, public teardowns, and supplier disclosures: die size, package, process node, function class, manufacturer, and manufacturer-to-fab attribution. Our predictive models stay in-house; the contractor does the data-engineering work that feeds them.

Scope:

1. Component identification corpus. Build a curated, sourced corpus of structured component data extracted from electronics datasheets at scale: die area, package type, process node, function class, manufacturer, and sub-part identifiers. Cover the component classes that dominate our customer BOMs (ICs, passives, connectors, magnetics) with full provenance per row.
2. Teardown data extraction. Mine public teardown sources (iFixit, FCC filings, repair guides, conference talks, vendor whitepapers) for component lists, die-area measurements, packaging details, and board-level layouts on products in our customer pipeline. Reconcile teardown observations against vendor datasheets where both exist.
3. Manufacturer-to-fab attribution. Build the mapping from manufacturer + part-family to foundry / fab where production occurred. Where direct disclosure is unavailable, document the inference path and a confidence level rather than asserting unknown.
4. Feature-engineering inputs. Deliver the structured feature columns the in-house models consume (no model work in scope): die-area distributions per node, package mix, manufacturer share, function-class share, and similar. Versioned releases on a defined cadence with changelogs.

Required:
- Hands-on data engineering with messy, unstructured inputs (PDFs, scans, datasheets, scraped HTML).
- Familiarity with electronics components: ICs, passives, connectors, magnetics, and the basics of semiconductor packaging.
- Comfortable reading electronics datasheets and pulling structured fields out of them at scale.
- Python (or equivalent) for pipeline work; SQL for corpus storage and querying.
- Strong data hygiene: sourcing, traceability, units, provenance.

Nice to have:
- Semiconductor packaging knowledge (BGA, QFN, WLCSP, flip-chip, die-on-leadframe).
- Familiarity with process nodes and manufacturer-to-fab mappings (TSMC, Samsung, Intel, GlobalFoundries, SMIC, UMC).
- LLM-assisted extraction experience (vision-language models on datasheets, structured-output prompting).
- Familiarity with public teardown sources (iFixit, TechInsights, ChipRebel, FCC filings).
- Familiarity with semiconductor LCA literature.

Engagement: Fixed-scope project, approximately 9 months calendar time, starting 2026-09-01. Remote, EU time zone preferred. NDA and IP assignment under our standard contractor agreement.
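
To make the scope concrete for applicants, here is a minimal Python sketch of what one corpus row with per-row provenance, a fab attribution carrying a confidence level, and one derived feature (median die area per process node) might look like. Every name in it (`ComponentRow`, `FabAttribution`, `Confidence`, `median_die_area_by_node`) and every field layout is an assumption for illustration only; the actual schema and feature definitions would be agreed during the engagement.

```python
from dataclasses import dataclass
from enum import Enum
from statistics import median
from typing import Dict, Iterable, Optional


class Confidence(Enum):
    """Hypothetical confidence levels for a fab attribution."""
    DISCLOSED = "disclosed"  # stated directly in a public disclosure
    INFERRED = "inferred"    # derived indirectly; inference path required
    UNKNOWN = "unknown"      # no usable evidence found


@dataclass(frozen=True)
class FabAttribution:
    foundry: Optional[str]
    confidence: Confidence
    inference_path: str = ""  # free-text description of how the fab was inferred

    def __post_init__(self) -> None:
        # An inferred attribution without a documented path is rejected.
        if self.confidence is Confidence.INFERRED and not self.inference_path:
            raise ValueError("inferred attribution needs a documented inference path")


@dataclass(frozen=True)
class ComponentRow:
    part_number: str
    manufacturer: str
    function_class: str
    package: str
    process_node_nm: Optional[float]  # None when the source does not state it
    die_area_mm2: Optional[float]     # None when the source does not state it
    source: str                       # provenance: where this row was extracted from
    fab: FabAttribution


def median_die_area_by_node(rows: Iterable[ComponentRow]) -> Dict[float, float]:
    """Group rows by process node and return the median die area per node.

    Rows missing either the node or the die area are skipped.
    """
    by_node: Dict[float, list] = {}
    for row in rows:
        if row.process_node_nm is not None and row.die_area_mm2 is not None:
            by_node.setdefault(row.process_node_nm, []).append(row.die_area_mm2)
    return {node: median(areas) for node, areas in by_node.items()}
```

Keeping the inference path on the attribution itself, rather than in a side document, means a row cannot claim an inferred fab without saying how it got there. A SQL table with the same columns and a CHECK constraint would be one equivalent way to enforce this in corpus storage.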
$225,000.00
Fixed-price
- Experience level: Expert
- Remote job
- Project type: Ongoing project
Activity on this job
- Proposals: 20 to 50
- Last viewed by client: 4 days ago
- Interviewing: 8
- Invites sent: 0
- Unanswered invites: 0
About the client
- USA, Austin, 2:36 PM
- $4.2K total spent; 3 hires, 2 active
- 2 hours
- Engineering & Architecture; mid-sized company (10-99 people)