As a skilled data engineer, developer, and biomedical engineer, I have spent the last several years honing my expertise to offer top-tier services to clients. I specialize in data engineering, including data warehousing, ETL optimization, data modeling, data governance, and data visualization, and I have a strong background in machine learning and MLOps. Along the way I have developed a command of a wide range of technologies, including Hadoop, Databricks, Docker, Terraform, Apache Spark, Collibra, big data and business intelligence tooling, cloud computing (AWS, GCP, and Azure), TensorFlow, Keras, and Kubernetes. My preferred languages for data projects are Python, R, SQL, Rust, and Java, and I am also proficient in languages for full-stack development, including JavaScript, TypeScript, HTML, and CSS. I've included examples of recent projects in my portfolio to showcase my capabilities.
Recently, I completed two competitive fellowships: the DS4 Data Engineering Fellowship and the Aspen Tech Hub Tech Policy Fellowship. The latter, based in San Francisco, focused on training technical leaders in public policy and regulation. During that fellowship, I learned to produce policy deliverables such as memos, briefs, whitepapers, and action plans, and I presented technical solutions to government and institutional stakeholders.
During the DS4 Data Engineering Fellowship, I worked with more than 32 million records drawn from the Centers for Medicare & Medicaid Services (CMS) Open Payments program, Merit-Based Incentive Payment System (MIPS) Final Scores, and World Health Organization global health expenditure data. The project earned my team the distinguished project award and an honors certificate. To optimize storage and processing efficiency, I converted the raw files into Parquet, a columnar storage format, and applied the Snappy compression algorithm, reducing the storage footprint by more than 83%. This transformation not only streamlined data retrieval but also improved the performance of the downstream processing pipelines. My role involved developing data models, designing efficient pipelines, and leveraging tools such as PySpark, PyArrow, Pandas, Python, Airflow, and an AWS PostgreSQL instance.
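To give a sense of that conversion step, here is a minimal sketch using PyArrow; the file names and paths are hypothetical stand-ins, and the actual pipeline ran these conversions at scale under Airflow with PySpark:

    import os
    import pyarrow.csv as pv
    import pyarrow.parquet as pq

    # Hypothetical paths for illustration; the real pipeline processed
    # CMS Open Payments extracts in bulk.
    src = "open_payments_general.csv"
    dst = "open_payments_general.parquet"

    # Read the raw CSV into an Arrow table, then write it back out as
    # columnar Parquet with Snappy compression.
    table = pv.read_csv(src)
    pq.write_table(table, dst, compression="snappy")

    # Compare on-disk sizes; the columnar layout plus Snappy compression
    # is what drove the >83% storage reduction described above.
    reduction = 1 - os.path.getsize(dst) / os.path.getsize(src)
    print(f"storage reduction: {reduction:.1%}")

Beyond the smaller footprint, Parquet's columnar layout lets downstream queries read only the columns they need, which is where much of the pipeline speedup came from.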
As a medical device specialist, I have extensive experience in regulatory affairs and quality control across the full product life cycle. I have worked on Class I, Class II, and Class III devices in several areas, including software applications, dermatology, diagnostic biotech, diagnostic radiology, orthopedics, cardiology, urology, and physical therapy. I am skilled in preparing regulatory documentation such as design history files, risk assessments, 510(k) submissions, design input/output verification and validation (DIOVV) traceability, quality plans, and verification/validation test protocols, and my portfolio includes examples of these documents alongside public policy papers I have written.
I offer services in data engineering, machine learning, medical device development, and general software development, and I am confident that my skills and experience can deliver results that exceed your expectations. Please feel free to contact me to discuss your project and how I can help you achieve your goals. Let's connect and make your project a success.