Technology GPT-4 is coming next week – and it will be multimodal, says Microsoft Germany

tom_mai78101

The Helper Connoisseur / Ex-MineCraft Host
Staff member
GPT-4 is coming next week: at an approximately one-hour hybrid information event entitled "AI in Focus - Digital Kickoff" on 9 March 2023, four Microsoft Germany employees presented large language models (LLMs) such as the GPT series as a disruptive force for companies, and detailed their Azure-OpenAI offering. The kickoff event took place in German; the news outlet Heise was present. Rather casually, Andreas Braun, CTO of Microsoft Germany and Lead Data & AI STU, mentioned what he said was the imminent release of GPT-4. That Microsoft is fine-tuning multimodality with OpenAI has been no secret since the release of Kosmos-1 at the beginning of March.

"We will introduce GPT-4 next week; there we will have multimodal models that will offer completely different possibilities – for example videos," Braun said. The CTO called LLMs a "game changer" because they teach machines to understand natural language: the machines then grasp, in a statistical way, what was previously only readable and understandable by humans. In the meantime, the technology has come so far that it basically "works in all languages": you can ask a question in German and get an answer in Italian. With multimodality, Microsoft(-OpenAI) will "make the models comprehensive".

Braun was joined by the CEO of Microsoft Germany, Marianne Janik, who spoke broadly about disruption through AI in companies. Janik emphasised the value-creation potential of artificial intelligence and spoke of a turning point: the current AI development and ChatGPT are "an iPhone moment". It is not about replacing jobs, she said, but about doing repetitive tasks differently than before. One point often forgotten in the public discussion, she noted, is that "we in Germany still have a lot of legacy in our companies" and "keep old treasures alive for years".


An excerpt from ChatGPT explaining what a multimodal model is:

ChatGPT said:
"As an AI language model, I do not have any information on whether or when GPT-4 will be released, but I can explain what a multimodal model is.

A multimodal model is a type of AI model that can process and generate information across multiple modes or modalities, such as text, images, audio, and video. These models combine multiple sources of input data to make predictions or generate outputs that are more comprehensive and accurate than a model that processes only one modality.

For example, a multimodal model trained on both text and images can generate captions for an image that are more detailed and accurate than a model that only processes text or images separately. Similarly, a multimodal model that can process both audio and video can identify and classify objects in a video more accurately by taking into account both the visual and audio cues.

Overall, the ability to process multiple modalities makes multimodal models more versatile and effective in a wider range of applications, from natural language processing and computer vision to speech recognition and autonomous vehicles."
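The idea ChatGPT describes above — separate encoders per modality whose outputs are combined before a prediction is made — can be sketched in a few lines. This is a toy "late fusion" illustration only; the encoders, feature sizes, and weights are all made-up stand-ins, not how GPT-4 or any real multimodal model works.

```python
def encode_text(text):
    # Hypothetical text encoder: a fixed-length character-frequency vector.
    vec = [0.0] * 4
    for ch in text.lower():
        vec[ord(ch) % 4] += 1.0
    return vec

def encode_image(pixels):
    # Hypothetical image encoder: mean and max brightness as two features.
    return [sum(pixels) / len(pixels), max(pixels)]

def fuse_and_score(text, pixels, weights):
    # Late fusion: concatenate both modality embeddings,
    # then apply a single linear scorer to the joint features.
    features = encode_text(text) + encode_image(pixels)
    return sum(f * w for f, w in zip(features, weights))

# Usage: a 6-dimensional weight vector (4 text + 2 image features).
score = fuse_and_score("a red ball", [0.2, 0.9, 0.4], [0.1] * 6)
```

The point of the sketch is only the structure: because the scorer sees text and image features side by side, it can learn correlations between them — the property the excerpt credits for better captions and video understanding.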
 