Extreme Algorithmization
Performance Engineering in CUDA, HIP, OpenCL, Algorithms & Data Structures, C/C++, SQL, Python, SIMD, AVX-512, Raytracing, Multi-threading, HPC, Low-latency, High-frequency Trading, Cache, AArch64, ARM Assembly, AI, GPT4, OpenAI, LLMs, Transformers, Blockchain, Crypto-miners
Overview
Some of our engineers have earned the EVT badge (Expert-Vetted Talent, the top 1% on Upwork), so you can be sure that the best talent will be working on your problem. Out of roughly 30 million programmers worldwide, only a few thousand know Algorithms & Data Structures better than our engineers, as proven by results in programming competitions. Please contact us if you need that skill level (top 0.01%). We can do algorithmic and performance work in C/C++, Python, SQL, Java, MQL4, MQL5, C#, Assembly, JavaScript, and probably other languages.

- With unique skills in Algorithms & Data Structures, we improve programs asymptotically (often by 100x or more on large input data).
- 5 to 15+ years of work experience.

Working for hire, we have implemented:
- efficient multi-threading, scaling real-world workloads almost linearly with the number of CPU cores (128x on the AMD Ryzen Threadripper 3990X)
- SIMD vectorization (SSE to AVX-512) and RTM (Restricted Transactional Memory) based acceleration, with up to 16x improvement in a compute thread or even in copying
- cache-aware algorithms: up to 50x improvement on some workloads
- up to 20 trillion operations/second in CUDA on a GTX 1080 (thousands of times faster than a CPU)
- ray-tracing with OWL and OptiX up to the theoretical limit (6.8 Gigarays/second on an RTX 2080 laptop GPU)
- up to 20x speedup for cryptocurrency miners using AVX-512 and cache-friendly algorithms

Programming languages: C++, C++11/14/17, C, Python, x86/x86_64/ARM/AArch64 assembly, LLVM IR, SQL, C#, JavaScript, HTML, CSS, MATLAB, Java, RDF, MQL4, MQL5, Delphi/Pascal, XML, Cypher.
Libraries/Frameworks: PyTorch, TensorFlow, HuggingFace Transformers/Accelerate/Safetensors, Hivemind/Petals, OpenAI, tiktoken, Django, Flask, STL, LibSVM, .NET, XGBoost.
Technologies: OpenMP, CUDA, SIMD (AVX & SSE, RTM), ASP.NET, WebForms, WinForms, Linux Kernel Modules, OptiX, OWL (OptiX Wrapper Library), RTX, raytracing.
Theory/Principles/Know-how/Methodologies: Algorithms & Data Structures, Performance Optimization, Artificial Intelligence, Multithreading, Vectorization, Object-Oriented Programming, compiler implementation, linkers, Mathematics, Semantic Web, Scrum, Agile, Bayesian Learning, Blockchain.
Open-source code: Clang, LLVM, LLVM's compiler-rt library, a few of our own repositories, Linux Kernel.
Tools/APIs/Architectures/Platforms: Postgres, MSSQL, MySQL, Neo4j, MATLAB, HeidiSQL, Perforce, Fisheye, Confluence, Hudson, Jenkins, CMake, JIRA, SVN, Git, QEMU, ARM, embedded, MT4, Nintendo Switch, MetaTrader 5, Anaconda, PyCharm.
Virtual Machines: VMware, VirtualBox, QEMU, Hyper-V.
OSes: Windows, Linux, Android, Solaris.