September 9, 2019
By 大卫·杰特, Director of Product Management for Manageability and Analytics at Vbrick

Transforming the Workplace with Video AI

Artificial Intelligence (AI) refers to a broad set of approaches for allowing computers to mimic human abilities. This differs from automation, which is the process of creating hardware or software capable of carrying out process-based tasks without human intervention.

The Basics of Modern AI

The most common form of AI today is machine learning, where massive amounts of data are "fed" into an algorithm in order to train it. Once trained, the algorithm is able to identify and then categorize items in subsequent data feeds unassisted. Machine learning algorithms use an iterative process, so as the models are exposed to new data, they adjust based on what they have "learned." A key shortcoming of machine learning is its reliance on vast amounts of sample data in order to become accurate enough to use. Thus, current applications of machine learning are limited by the availability of high-quality input data.
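The iterative adjustment described above can be illustrated with a minimal sketch. It uses a perceptron, a classic textbook learning rule chosen here purely for illustration; the article does not name a specific algorithm, and the function names and toy data below are hypothetical. Real systems train on far larger datasets.

```python
def train_perceptron(samples, epochs=10, learning_rate=1.0):
    """Fit a two-feature linear classifier by repeated small corrections."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for (x1, x2), label in samples:
            activation = weights[0] * x1 + weights[1] * x2 + bias
            prediction = 1 if activation > 0 else 0
            error = label - prediction  # 0 when the guess was right
            # Adjust the model only when it was wrong on this sample.
            weights[0] += learning_rate * error * x1
            weights[1] += learning_rate * error * x2
            bias += learning_rate * error
    return weights, bias


def predict(model, point):
    """Classify a new point with the trained weights and bias."""
    weights, bias = model
    activation = weights[0] * point[0] + weights[1] * point[1] + bias
    return 1 if activation > 0 else 0


# Toy training data (an assumption for this sketch): label is 1 only
# when both features are "on".
AND_SAMPLES = [
    ((0.0, 0.0), 0),
    ((0.0, 1.0), 0),
    ((1.0, 0.0), 0),
    ((1.0, 1.0), 1),
]

model = train_perceptron(AND_SAMPLES)
```

Each pass over the data nudges the weights only when a prediction is wrong, which is one simple form of a model adjusting to what it has "learned." With only four samples the toy problem is easy; the reliance on large volumes of sample data arises with realistic inputs such as speech or video frames.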

Another AI discipline, and the one most relevant to applying AI to video, is computer vision. In computer vision, the goal is to interpret the visual elements of an image or video using artificial intelligence. Computer vision may use either machine learning or deep learning techniques to accomplish this goal, and it is the foundation of emerging technology applications such as facial recognition and automated vehicles. Teaching computers to process visual data just as a human would has proven much harder than simply connecting algorithms to cameras. Much of the challenge is rooted in our still-basic understanding of how human vision actually works, which makes it difficult to replicate. Nevertheless, computer vision is currently one of the most exciting facets of AI for business strategists, with 58% of purchase influencers beginning to plan computer vision investments in their enterprise technology portfolio within the next year, according to Forrester.

The Building Blocks of Video AI

Spoken words are a critical component of video, and there are a number of ways that AI is helping to interpret speech.

Machine Transcription: This is one of the earliest examples of AI, where an algorithm converts voice data into a text transcript. The technology is now commonplace, even built into our smartphones, but it is also undergoing a renaissance thanks to innovative new deep learning techniques becoming available.

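Machine Translation: Once spoken language has been digested into text data, it opens up other capabilities, such as translation into other languages. One of the key AI pioneers in this area is Google, which first launched its translation service in 2006, using United Nations and European Parliament documents as its base linguistic data. As of May 2017, Google supported over 100 languages and was serving 500 million people daily.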

Speaker Recognition: This is the ability of an AI to recognize the identity of a speaker based on their voice and speech patterns. A key dependency of this ability is an existing sample of the person’s voice on which to train the AI.

Optical Character Recognition (OCR): OCR is the art of recognizing text within visual content, such as text embedded in presentation slides. The primary benefit of OCR in the business world is that it further enables search engines to surface visual content to users without overdependence on accurate and comprehensive metadata.

Sentiment Analysis: Another way to enrich text data is via an additional layer of information called sentiment. This algorithm interprets dialogue to both identify and quantify affective states. Affective states are distinct from emotions in that they are longer-lasting mood states (such as anxiety or depression) that result from many events.

Text Summarization: One of the newer text applications that will help build the next generation of video AI is content summarization. This is when an algorithm is able to boil down hours of video into a concise text summary. Summarization algorithms will take into account the placement or emphasis of messages within a video.
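To make the summarization idea concrete, here is a minimal sketch of extractive summarization over a transcript. It is an illustration under stated assumptions, not how any particular product works: the stopword list, the `lead_bonus` weighting for the opening sentence (a crude stand-in for "placement"), and all function names are hypothetical. Production systems typically rely on trained language models rather than word counts.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard set).
STOPWORDS = {"a", "and", "at", "can", "for", "had", "in", "is", "it",
             "of", "the", "to", "was", "we"}


def content_words(text):
    """Lowercase the text and keep only words that carry meaning."""
    return [w for w in re.findall(r"[a-z']+", text.lower())
            if w not in STOPWORDS]


def summarize(transcript, max_sentences=2, lead_bonus=1.5):
    """Pick the highest-scoring sentences and return them in original order."""
    sentences = [s.strip()
                 for s in re.split(r"(?<=[.!?])\s+", transcript)
                 if s.strip()]
    frequency = Counter(content_words(transcript))

    scored = []
    for index, sentence in enumerate(sentences):
        tokens = content_words(sentence)
        if not tokens:
            continue
        # Average word frequency approximates how central a sentence is.
        score = sum(frequency[t] for t in tokens) / len(tokens)
        if index == 0:
            score *= lead_bonus  # placement: favor the opening statement
        scored.append((score, index))

    top = sorted(scored, reverse=True)[:max_sentences]
    chosen = sorted(index for _, index in top)
    return " ".join(sentences[i] for i in chosen)
```

Calling `summarize(transcript)` on a machine-generated transcript returns at most `max_sentences` sentences, which shows how transcription output can feed a later summarization step.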

To learn more about the fundamentals of video AI, read the Vbrick blog post "The Fundamentals of Video AI."

Beyond spoken words and text in video, AI promises to identify objects and actions to further enhance the value it can bring to video.

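Object Recognition: After a machine learning algorithm has digested a video frame, the Object Recognition process identifies the various subjects within it. For an AI, Object Recognition is a collection of related tasks rather than the single step that human vision perceives it to be. The key elements of Object Recognition are image classification, object localization, and finally object detection.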

Action Detection: One key advantage of video content is the ability to show a story instead of telling it. Advances in computer vision are enabling AIs to decode what is being done in a video, not just who is in it.

Combining Object Recognition with Action Detection will allow the analysis or prediction of why an object is performing an action. Again, the algorithm needs extensive training to recognize actions, and the action must be visually detectable. The ability to infer that an off-screen action has occurred still eludes AI observers.
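The three Object Recognition tasks named earlier (image classification, object localization, and object detection) differ mainly in what they return for a frame. The sketch below is a hypothetical illustration of those output shapes only: the `Detection` type, the function names, and the sample values are assumptions, and in practice each task is usually handled by a trained model rather than derived from a shared list as shown here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class Detection:
    label: str    # what the object is
    box: Box      # where it is in the frame
    score: float  # model confidence between 0 and 1


def classify_image(detections: List[Detection]) -> Optional[str]:
    """Image classification: a single label for the whole frame."""
    if not detections:
        return None
    return max(detections, key=lambda d: d.score).label


def localize(detections: List[Detection], label: str) -> Optional[Box]:
    """Object localization: the box of the best match for one label."""
    matches = [d for d in detections if d.label == label]
    if not matches:
        return None
    return max(matches, key=lambda d: d.score).box


def detect(detections: List[Detection], min_score: float = 0.5) -> List[Detection]:
    """Object detection: every labeled box above a confidence threshold."""
    return [d for d in detections if d.score >= min_score]
```

Given a frame with a presenter, a laptop, and a low-confidence guess, classification would return one label, localization one box, and detection the full list of confident labeled boxes.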

The application of Artificial Intelligence is becoming much more commonplace, and we are seeing the value it can bring to our personal and professional lives. As the use of live streaming and on-demand video continues to grow in the workplace, the addition of AI promises to greatly expand how video can be used and the value it can bring in transforming how work is done and how workers communicate and collaborate.

To learn more about Video AI and see how Vbrick is implementing Video AI features into our product roadmap, be sure to register for our webinar, "How Video AI Is Transforming the Workplace," on September 19.

This article is sponsored content.