About Me

I am a Senior Research Scientist at IBM Research AI in Zurich and I lead the activities around Efficient NLP. My mission is to identify opportunities to make NLP cost-effective in real-world deployments.

We are heavily invested in LLMs. We are building synthetic data generation pipelines to fine-tune small, specialized models for specific use cases. Some of our recent work has been demoed at NeurIPS and AAAI 2023.

As part of our work, we introduced novel model architectures that enable millisecond latencies on CPUs, and created encoders for terms that match the quality of sentence encoders at one-tenth of the latency.

If you are interested in these topics and want to join our efforts, please drop me a note!

My passion is providing efficient and cost-effective solutions to complex problems via innovative algorithmic design and highly optimized implementations. I enjoy squeezing instructions into CPU cycles, designing algorithms tailored to specific hardware, and being part of hardware-software co-design processes.

My experience ranges from low-level network programming in the Linux kernel, to high-performance math programming using SIMD instructions, up to designing novel machine learning models. I wrote software to monitor very large-scale distributed systems (20k+ nodes).

Some achievements:

  • wrote the time-series database powering many IBM storage products (e.g., Storwize or IBM Sonas).
  • designed indexing algorithms that are patented and used in products (see Wikipedia page).
  • 9 granted and 6 pending patents, 9 of which are in the Artificial Intelligence space.
  • 20+ papers published at top venues in Artificial Intelligence (ACL, IJCAI, EMNLP), databases (VLDB), and networking (CCR, IMC).
  • worked on very large-scale distributed systems products, such as IBM Sonas or Endace's monitoring products.
  • contributed to open-source software including the Linux kernel, ntopng, libpcap, and Wireshark.

You can fetch my curriculum vitae.