Francesco Fusco - LLM and ML Systems Researcher

I am a Senior Research Scientist at IBM Research AI.
I work at the intersection of AI modeling and ML systems.

We build model architectures optimized for low-latency inference, such as the pNLP-Mixer, and contribute to IBM products and IBM Granite models with synthetic data generation (SDG) technologies and RL pipelines for verifiable domains. Our tech stack allowed an IBM model to top the BIRD Text2SQL leaderboard.

My passion is providing efficient and cost-effective solutions to complex problems via innovative algorithmic design and highly optimized implementations. I enjoy squeezing instructions into CPU cycles, designing algorithms tailored for specific hardware, and being part of hardware-software co-design processes.

My experience ranges from low-level network programming in the Linux kernel, to high-performance programming using SIMD instructions and CUDA kernels, up to designing novel machine learning models optimized for low-latency systems. I wrote software to monitor very large-scale distributed systems (20k+ nodes) and cybersecurity solutions scaling to nationwide deployments.

delivered the time-series database powering many IBM storage products (e.g., Storwize and IBM Sonas).
designed state-of-the-art indexing and compression algorithms (see Wikipedia page).
serial inventor, with 12 granted and 3 pending patents in AI/ML, compression, indexing and networking.
20+ papers published in top venues in artificial intelligence (ACL, IJCAI, EMNLP), databases (VLDB), and networking (CCR, IMC).
implemented very large-scale distributed systems products, such as IBM Sonas and Endace's monitoring products.
contributed to open-source software including PyTorch, the Linux kernel, Open vSwitch, ntop and Wireshark.

About Me