In the current wave of artificial intelligence, large models have become the cornerstone of many advanced applications thanks to their powerful learning and reasoning abilities. However, these models typically demand expensive hardware, especially high-performance GPUs, which puts them out of reach for many researchers and developers. JittorLLMs, launched by the Chinese company Fitten, is changing that by making local deployment of large models practical on everyday hardware.
Project website: https://github.com/Jittor/JittorLLMs
- Low cost, high performance:
JittorLLMs is an inference library designed specifically for large models, and its core advantage is dramatically lower hardware requirements. Compared with mainstream frameworks, JittorLLMs cuts hardware demands by 80%: large models can run in as little as 2GB of memory, even without a dedicated graphics card. In other words, anyone can deploy a large model locally on an ordinary machine, without an expensive hardware investment; the sketch below shows what launching such a deployment can look like.
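As a concrete picture of local deployment, the following minimal sketch launches the repository's command-line chat demo from Python. The script name (`cli_demo.py`) and the model identifier follow the project README, but treat them as assumptions to verify against the current repository.

```python
import subprocess
import sys

# Prerequisites (run once in a shell):
#   git clone https://github.com/Jittor/JittorLLMs.git --depth 1
#   cd JittorLLMs && pip install -r requirements.txt

# Launch the command-line chat demo for a chosen model. The script name
# (cli_demo.py) and model identifiers come from the project README;
# verify them against the current repository before relying on them.
MODEL = "chatglm"

subprocess.run(
    [sys.executable, "cli_demo.py", MODEL],
    check=True,  # raise CalledProcessError if the demo exits abnormally
)
```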
- Wide support and high portability:
JittorLLMs supports a range of large models, including ChatGLM, Pengcheng PanGu, BlinkDL's ChatRWKV, Meta's LLaMA/LLaMA2, and MOSS, with support for more strong domestically developed models planned. Through JTorch, a Jittor-based implementation of the PyTorch interface, users can migrate models and adapt to a wide range of heterogeneous compute devices and environments without modifying any code; a minimal illustration follows below.
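The portability claim is easiest to see in code. The sketch below is ordinary PyTorch-style code; the idea is that it runs unchanged whether `torch` is the original PyTorch or JTorch's Jittor-backed compatibility layer. Whether JTorch is imported under the name `torch` or as a separate module is an assumption here; check the JTorch documentation.

```python
# PyTorch-style code with no Jittor-specific calls. Under JTorch, the same
# API is served by a Jittor backend; the exact import arrangement (whether
# JTorch shadows the name `torch`) is an assumption to verify.
import torch

def tiny_forward(x: torch.Tensor) -> torch.Tensor:
    # Standard tensor ops from the PyTorch API surface.
    w = torch.randn(x.shape[-1], 4)
    return torch.relu(x @ w)

y = tiny_forward(torch.ones(2, 8))
print(y.shape)  # torch.Size([2, 4]) regardless of which backend serves `torch`
```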
- Dynamic swapping technology, reducing development difficulty:
With the dynamic swapping technology developed by the Jittor team, Jittor became the first framework to support automatic swapping of dynamic-graph variables. Without any code changes, tensor data is swapped automatically between GPU memory, system memory, and disk, which greatly lowers the difficulty of developing large models; see the configuration sketch below.
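A hedged sketch of how a user might opt into swapping: the JittorLLMs documentation describes environment variables that cap device and host memory so that tensors spill automatically from GPU memory to RAM to disk. The variable names and values below are assumptions modeled on that documentation; verify them against the current README.

```python
import os

# Assumed memory-saving switches modeled on the JittorLLMs docs; verify the
# exact names against the current README. Set them before importing jittor.
os.environ["JT_SAVE_MEM"] = "1"                  # enable memory-saving mode (assumed)
os.environ["cpu_mem_limit"] = str(16 * 10**9)    # cap host RAM usage, in bytes (assumed)
os.environ["device_mem_limit"] = str(8 * 10**9)  # cap GPU memory usage, in bytes (assumed)

# Model code itself needs no changes: once the limits are in place, the
# framework swaps tensor data between GPU memory, RAM, and disk on its own.
```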
- Fast loading and computational performance:
Through zero-copy technology and automatic compilation optimization of meta-operators, the Jittor framework reduces large-model loading overhead by 40% and improves computational performance by more than 20%. With sufficient GPU memory, JittorLLMs outperforms comparable frameworks, and it keeps running at reduced speed even when GPU memory is tight or no graphics card is present. The sketch below illustrates the zero-copy idea.
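The zero-copy idea itself is easy to demonstrate in isolation. The sketch below is a conceptual illustration using `numpy.memmap`, not Jittor's actual implementation: instead of reading a weight file into a fresh buffer, the file is mapped into the address space and pages are pulled in lazily on first access.

```python
import numpy as np

# Create a dummy "weight file" so the sketch is self-contained.
np.arange(4096, dtype=np.float16).tofile("weights.bin")

# Zero-copy idea: map the file rather than copying it into a new buffer.
# The OS faults pages in lazily, so startup does no bulk read at all.
weights = np.memmap("weights.bin", dtype=np.float16, mode="r")

first_rows = weights[:1024]  # a view into the mapping; still no copy
print(first_rows[:4])        # data is paged in only when actually touched
```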
JittorLLMs not only opens new possibilities for deploying large models but also clears a path toward broader adoption of artificial intelligence. As the technology continues to advance, there is good reason to believe AI will become more accessible and more deeply woven into everyday life.