Generative AI - Image
Our work in visual AI serves as the perceptual system for our agents, allowing them to see and interpret the physical and digital world.
- Visual Generation & Understanding: We generate images programmatically and, critically, use Vision-Language Models (VLMs) for visual reasoning tasks.
- Real-time Perception: We implement and fine-tune YOLO models for high-performance, real-time object detection and segmentation.
- Custom Model Development: We research custom neural network architectures and task-specific fine-tuning of YOLO models to improve detection performance.
- Data Foundation: We are developing a data annotation pipeline to support the training of our VLMs, document parsing models, and other image models.
- VLM Optimization: We specialize in the fine-tuning and quantization of VLMs to enable their deployment on resource-constrained edge devices.
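
To make the quantization step concrete, the following is a minimal sketch of symmetric per-tensor int8 quantization, the basic idea behind shrinking model weights for edge deployment. It is an illustration under stated assumptions, not our production method: the function names `quantize_int8` and `dequantize_int8` are hypothetical, the weights are a toy list, and real VLM quantization typically works per channel or per group through a dedicated library.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization (hypothetical helper for illustration).

    Maps floats to integers in [-127, 127] using one shared scale factor.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Guard against an all-zero tensor, where any scale would work.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the int8 representation."""
    return [v * scale for v in quantized]


# Toy example: four weights stored as small integers plus a single float scale.
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the accuracy traded for roughly a 4x reduction in storage relative to 32-bit floats.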
Generative AI - Video
We create models and AI tools that synthesize, modify, and generate video content from natural language descriptions and source imagery.
- Text-to-Video Synthesis: Generating original video sequences, conceptual simulations, or visual scenarios directly from descriptive natural language prompts.
- Algorithmic Video Modification: Using AI models to alter the visual style, environment, or specific attributes of existing footage for research purposes.

