WO2012079510A1 - Mute indication method and device applied to video conferencing - Google Patents

Publication number
WO2012079510A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
terminal
sound
mcu
mute
Application number
PCT/CN2011/084000
Other languages
French (fr)
Chinese (zh)
Inventor
吴永明
Original Assignee
中兴通讯股份有限公司
Application filed by 中兴通讯股份有限公司 (ZTE Corporation)
Publication of WO2012079510A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00: Television systems
    • H04N7/14: Systems for two-way working
    • H04N7/15: Conference systems
    • H04N7/152: Multipoint control units therefor

Definitions

  • A video conferencing system is a multimedia communication system that supports remote, bidirectional transmission of sound and video. It helps users in different locations communicate face to face, visually, in real time and in both directions.
  • Standards organizations such as the International Telecommunication Union (ITU), the Internet Engineering Task Force (IETF), and the Third Generation Partnership Project (3GPP) each work on multimedia standardization.
  • The ITU has developed several multimedia communication standards, including ITU-T H.320, ITU-T H.323, and ITU-T H.324.
  • ITU-T H.320 targets multimedia communication applications on narrowband circuit-switched networks.
  • ITU-T H.323 targets multimedia communication applications on IP networks.
  • ITU-T H.324 targets multimedia communication applications on very low-speed networks, such as the PSTN (Public Switched Telephone Network) and mobile networks.
  • PSTN: Public Switched Telephone Network
  • The IETF is responsible for the Session Initiation Protocol (SIP) and the multimedia conferencing standards based on it.
  • 3GPP is responsible for the IP Multimedia Subsystem (IMS) standard. Building on the IETF standards, it has also developed a multimedia conferencing standard for IMS networks, which is very close to the SIP-based standard established by the IETF.
  • Figure 1 depicts the basic principles of video conferencing communication.
  • The terminal 101 is the device used by the user; there are terminals 1 to n. Each terminal contains a codec, which compresses, encodes, and decodes sound, video, and other media. The terminal is also connected to a microphone, a camera, a display, and a sound playback subsystem for sound and video input and output. It further includes a user input interface through which the user enters commands and information.
  • When a video conference is held, the terminal 101 establishes a connection with an MCU (Multipoint Control Unit) 102 that carries two-way control signaling, audio, and video. To save network bandwidth, audio and video are generally transmitted over the network in a compression-encoded format.
  • MCU: Multipoint Control Unit
  • The MCU 102 carries out multi-party conference communication.
  • Each terminal 101 participating in the multi-party conference establishes a connection with the MCU 102 for two-way communication of control signaling, audio, and video.
  • The MCU 102 is responsible for switching and mixing the media streams.
  • For sound media streams, the MCU 102 generally outputs one mixed sound media stream for each terminal 101; the mix is generally produced by superimposing the several input sound media streams with the highest volume.
  • For video, the MCU 102 can send one terminal the single-picture video stream of another terminal. If the MCU 102 supports the multi-picture function, it can also combine the video from multiple terminals into one multi-picture image and send it to one or more terminals.
  • In a video conference, conference control functions are generally provided to meet users' conference management needs.
  • The conference control software 103 in Figure 1 performs the conference control functions.
  • An important function of the conference control software 103 is mute control of terminals.
  • To achieve good audio communication, terminals that do not currently need to speak are usually muted. Once a terminal is muted, the other terminals in the same conference cannot hear speech from that terminal.
  • In a conventional audio conferencing system, a special prompt tone is generally played to the muted terminal, for example a "beep" played at intervals.
  • The disadvantage of this approach is that the prompt is not intuitive enough and, to some extent, interferes with listening to the normal conference audio.
  • No effective solution has yet been proposed for the problem that the prompt-tone style of mute prompt in the related art is not intuitive enough and, to some extent, interferes with listening to the normal conference audio.
  • The present invention aims to provide a mute indication method and device applied to video conferencing, so as to solve the problem that the prompt-tone style of mute prompt in the related art is not intuitive enough and, to some extent, interferes with listening to the normal conference audio.
  • According to one aspect of the present invention, a mute indication method applied to a video conference is provided, including: a multipoint conference unit (MCU) performs sound activation detection on the audio media stream sent by a terminal that participates in the video conference and has been muted; the MCU obtains the detection result for the terminal, where the detection result is either a sound activation state or a sound inactive state; and when the detection result is the sound activation state, the MCU superimposes a mute video indication on the video signal sent to the terminal.
  • Preferably, the MCU performing sound activation detection on the audio media stream sent by the terminal participating in the video conference includes: the MCU periodically performs sound activation detection on the audio media stream.
  • Preferably, the MCU obtaining the detection result for the terminal includes: if the sound parameter of the audio media stream is higher than the threshold of the sound activation detection, the MCU determines that the detection result is the sound activation state; if the sound parameter is not higher than that threshold, the MCU determines that the detection result is the sound inactive state.
  • Preferably, the MCU superimposing the mute video indication on the video signal sent to the terminal includes: the MCU superimposes text or an icon on the video signal sent to the terminal, where the text or icon indicates that the terminal is muted.
  • Preferably, the MCU superimposing the mute video indication on the video signal sent to the terminal includes: the MCU repeats the superimposition of the mute video indication on every video frame sent to the terminal until the mute video indication is cancelled.
  • According to another aspect of the present invention, a mute indication device applied to a video conference is provided. The device is disposed in an MCU and includes: a detection module, configured to perform sound activation detection on the audio media stream sent by a terminal that participates in the video conference and has been muted; an acquisition module, configured to obtain the detection result for the terminal, where the detection result is either a sound activation state or a sound inactive state; and an overlay module, configured to superimpose the mute video indication on the video signal sent to the terminal when the detection result is the sound activation state.
  • Preferably, the detection module is further configured to periodically perform sound activation detection on the audio media stream.
  • Preferably, the acquisition module includes: a first determining submodule, configured to determine that the detection result is the sound activation state if the sound parameter of the audio media stream is higher than the threshold of the sound activation detection; and a second determining submodule, configured to determine that the detection result is the sound inactive state if the sound parameter is not higher than that threshold.
  • Preferably, the overlay module is further configured to superimpose text or an icon on the video signal sent to the terminal, where the text or icon indicates that the terminal is muted.
  • Preferably, the overlay module is further configured to repeat the superimposition of the mute video indication on every video frame sent to the terminal until the mute video indication is cancelled.
  • In the embodiments of the present invention, the MCU performs sound activation detection on the audio media stream sent by a terminal that participates in the video conference and has been muted. When the detection result is the sound activation state, the MCU superimposes a mute video indication on the video signal sent to the terminal.
  • The purpose of the embodiments of the present invention is to improve the communication experience of video conferencing and to make video conferencing simple and efficient to use.
  • The advantages of the embodiments of the present invention are that the prompt is intuitive, its content can be rich and accurate, and it appears dynamically: under normal circumstances there is no prompt, which keeps interference to the user minimal.
  • FIG. 1 is a schematic diagram of the basic principle of video conference communication according to the related art.
  • FIG. 2 is a processing flowchart of a mute indication method applied to a video conference according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of an MCU device supporting video overlay of mute prompt information, and of the corresponding processing flow, according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of another MCU device supporting video overlay of mute prompt information, and of the corresponding processing flow, according to an embodiment of the present invention.
  • FIG. 5 is a processing flowchart of a specific embodiment according to an embodiment of the present invention.
  • FIG. 6 is a display effect diagram of a mute prompt using the video overlay mode according to an embodiment of the present invention.
  • FIG. 7 is a display effect diagram of a mute prompt using the video insertion mode according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a mute indication device applied to a video conference according to an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of an acquisition module according to an embodiment of the present invention.
  • The processing flow is shown in FIG. 2 and includes: Step 202: The multipoint conference unit (MCU) performs sound activation detection on the audio media stream sent by a terminal that participates in the video conference and has been muted. Step 204: The MCU obtains the detection result for the terminal, where the detection result is either a sound activation state or a sound inactive state. Step 206: When the detection result is the sound activation state, the MCU superimposes the mute video indication on the video signal sent to the terminal.
  • In the embodiments of the present invention, the MCU performs sound activation detection on the audio media stream sent by a terminal that participates in the video conference and has been muted. When the detection result is the sound activation state, the MCU superimposes the mute video indication on the video signal sent to the terminal.
  • In the embodiments of the present invention, once a terminal is muted, if its user tries to speak, a mute video indication message is displayed in the received video signal, for example: "You are currently prohibited from speaking. Please request permission to speak first."
  • The purpose of the embodiments of the present invention is to improve the communication experience of video conferencing and to make video conferencing simple and efficient to use.
  • The advantages of the embodiments of the present invention are that the prompt is intuitive, its content can be rich and accurate, and it appears dynamically: under normal circumstances there is no prompt, which keeps interference to the user minimal.
  • Preferably, the MCU performing sound activation detection (VAD) on the audio media stream sent by the terminal participating in the video conference includes: the MCU periodically performs sound activation detection on the audio media stream.
  • The MCU continuously performs sound activation detection on the audio media stream and outputs a detection result for the sound activation state once every time period T1.
  • The detection result has two states: the sound activation state and the sound inactive state.
  • T1 can be an adjustable MCU configuration item.
  • Preferably, the MCU obtaining the detection result for the terminal includes: if the sound parameter of the audio media stream is higher than the threshold of the sound activation detection, the MCU determines that the detection result is the sound activation state; if the sound parameter is not higher than that threshold, the MCU determines that the detection result is the sound inactive state.
  • The threshold of the VAD detection can be adjusted to suit the specific situation.
  • In implementation, according to the result of step 204, the MCU may choose to superimpose (or insert) the mute video indication on the video signal sent to the terminal, or to cancel the superimposed (or inserted) mute video indication. The MCU checks whether the terminal is muted.
  • Preferably, when step 206 is implemented, the MCU superimposing the mute video indication on the video signal sent to the terminal includes: the MCU superimposes text or an icon on the video signal sent to the terminal, where the text or icon indicates that the terminal is muted. Attributes of the text or icon such as content, font, text size, color, and display position can be adjustable configuration items.
  • In implementation, the MCU repeats the superimposition of the mute video indication on every video frame sent to the terminal until the mute video indication is cancelled. Once the mute video indication is cancelled, no overlay processing is applied to the video frames. In the insertion mode, after the mute video indication is inserted, the MCU replaces the normal conference video stream with a mute prompt video stream that contains text or icon information indicating that the terminal is muted. Cancelling the mute video indication resumes sending the normal conference video stream.
  • FIG. 3 depicts an MCU device that supports overlaying mute prompt information on video, together with the corresponding processing flow, according to an embodiment of the present invention.
  • The network interface module 301 handles communication with the terminals and sends and receives the sound and video media streams.
  • The network interface module 301 sends the received audio stream (1) to the audio decoding module 302. The audio decoding module 302 decodes the compressed audio into a raw-format audio stream and sends that stream (2) to both the mixing module 303 and the sound activation detection module 304.
  • The mixing module 303 mixes the audio streams from the multiple terminals to achieve the effect of a multi-party call, and sends the mixed audio stream (4) to the audio encoding module 305.
  • The audio encoding module 305 compresses and encodes the raw audio and sends the encoded audio stream (3) to the network interface module 301.
  • The network interface module 301 sends the received video stream (5) to the video decoding module 306.
  • The sound activation detection module 304 performs sound activation detection on the audio media stream sent by a terminal that participates in the video conference and has been muted.
  • In this embodiment, T1 is 1000 ms: the sound activation detection module 304 reports the voice activation state (7) to the main control module 307 every 1000 ms.
  • The main control module 307 decides whether a mute video indication is needed. When the terminal is muted and a sound activation status indication is received, the mute video indication must be sent to the terminal; in all other cases, sending of the mute video indication stops.
  • The main control module 307 sends the graphics overlay module 308 a command (8) indicating whether to send the mute video indication, and the video decoding module 306 sends the raw-format video stream (6) destined for the terminal to the graphics overlay module 308.
  • The graphics overlay module 308 superimposes the mute prompt information on the raw-format video stream destined for the terminal and sends the superimposed raw-format video stream (9) to the video encoder 309. The video encoder 309 compresses and encodes the raw-format video stream and passes it to the network interface module 301, which sends it to the terminal.
  • FIG. 4 depicts another MCU device, with its processing flow, that supports mute prompt information in the video insertion mode according to an embodiment of the present invention.
  • The network interface module 401 handles communication with the terminals and sends and receives the sound and video media streams.
  • The network interface module 401 sends the received audio stream (1) to the audio decoding module 402. The audio decoding module 402 decodes the compressed audio into a raw-format audio stream and sends that stream (2) to both the mixing module 403 and the sound activation detection module 404.
  • The mixing module 403 mixes the audio streams from the multiple terminals to achieve the effect of a multi-party call, and sends the mixed audio stream (4) to the audio encoding module 405.
  • The audio encoding module 405 compresses and encodes the raw audio and sends the encoded audio stream (3) to the network interface module 401.
  • The video mixing and switching module 406 receives the video streams (5) sent by the terminals and either combines the video of several terminals into one multi-picture video or switches the video input of a selected terminal to the other terminals. Its output video stream (6) is sent to the video switching module 407.
  • The sound activation detection module 404 performs sound activation detection on the audio media stream sent by a terminal that participates in the video conference and has been muted.
  • In this embodiment, T1 is 1000 ms: the sound activation detection module 404 reports the voice activation state (8) to the main control module 408 every 1000 ms.
  • The main control module 408 decides whether a mute video indication is needed. When the terminal is muted and a sound activation status indication is received, the mute video indication must be sent to the terminal; in all other cases, sending of the mute video indication stops.
  • The main control module 408 sends the video switching module 407 a command (9) indicating whether to send the mute video indication.
  • According to the command from the main control module 408, the video switching module 407 selects either the normal conference video stream (6) or the mute prompt video stream (7) and sends it to the terminal.
  • The video prompting module 409 outputs the mute prompt video stream (7).
  • The advantage of inserting a mute video prompt is that it saves media computing resources, because video overlay operations usually consume considerable CPU resources.
  • FIG. 5 is a processing flowchart of an embodiment of the present invention, described on the basis of the MCU embodiment of FIG. 3.
  • Step 501: Accept raw-format audio stream data input by the terminal, for example audio data corresponding to a duration of 100 ms.
  • Step 502: Perform sound activation detection using the most recently received audio stream data. Depending on the VAD algorithm, the calculation may need saved historical audio stream data and previous calculation results. The VAD decision threshold can be configured by the user to adjust the decision sensitivity.
  • Step 503: Output the sound activation state.
  • Steps 501 to 503 may be executed by the VAD module. After step 503 finishes, the flow returns to step 501 and repeats. The sound activation state is then output to the main control module, which performs the subsequent steps 511 to 515.
  • Step 511: Receive the input and update the sound activation state.
  • Step 512: Determine whether it is the sound activation state; if so, perform step 513; if not, perform step 515.
  • Step 513: Determine whether it is the sound activation state,
  • Step 514: Send a request-overlay prompt message, notifying the video overlay module to perform video overlay, then return to step 511 and repeat.
  • Step 515: Send a cancel-overlay prompt message, notifying the video overlay module to cancel the video overlay, then return to step 512 and repeat.
  • The request-overlay or cancel-overlay prompt message is output to the video overlay module, which performs the subsequent steps 521 to 524.
  • Step 521: The video overlay module updates the video overlay state according to the input from the main control module.
  • Step 522: The video overlay module determines whether to perform video overlay; if so, step 523 is performed; otherwise, step 524 is performed.
  • Step 523: The video overlay module superimposes the prompt information on the video signal sent to the terminal. The prompt information may be an icon or a text string indicating the muted state; the content, font, text size, color, display position, and other attributes of the prompt text can be adjustable configuration items.
  • Step 524: The video overlay module performs no overlay processing.
  • The mute indication method provided by the embodiments of the present invention can produce a mute prompt in the video.
  • FIG. 6 shows the display effect of a mute prompt in the video overlay mode. The outer rectangular box represents a television screen, and the figure icon represents the video signal viewed at the terminal.
  • The text at the bottom is the superimposed mute prompt message, for example: "You are currently prohibited from speaking. Please request permission to speak first."
  • An embodiment of the present invention further provides a mute indication device applied to a video conference. Its structure is shown in FIG. 8. The device is disposed in the multipoint conference unit (MCU) and includes: a detection module 801, configured to perform sound activation detection on the audio media stream sent by a terminal that participates in the video conference and has been muted;
  • an acquisition module 802, coupled to the detection module 801 and configured to obtain the detection result for the terminal, where the detection result is either a sound activation state or a sound inactive state; and
  • an overlay module 803, coupled to the acquisition module 802 and configured to superimpose the mute video indication on the video signal sent to the terminal when the detection result is the sound activation state.
  • The detection module 801 can also be configured to periodically perform sound activation detection on the audio media stream.
  • The acquisition module 802 may include: a first determining submodule 901, configured to determine that the detection result is the sound activation state if the sound parameter of the audio media stream is higher than the threshold of the sound activation detection; and
  • a second determining submodule 902, configured to determine that the detection result is the sound inactive state if the sound parameter of the audio media stream is not higher than the threshold of the sound activation detection.
  • The first determining submodule 901 and the second determining submodule 902 are two parallel functional modules, each coupled to the detection module 801.
  • The overlay module 803 can also be configured to superimpose text or an icon on the video signal sent to the terminal, where the text or icon indicates that the terminal is muted.
  • The overlay module 803 can also be configured to repeat the superimposition of the mute video indication on every video frame sent to the terminal until the mute video indication is cancelled.
  • In the embodiments of the present invention, the MCU performs sound activation detection on the audio media stream sent by a terminal that participates in the video conference and has been muted. When the detection result is the sound activation state, the MCU superimposes the mute video indication on the video signal sent to the terminal.
  • Once a terminal is muted, if its user tries to speak, a mute video indication message is displayed in the received video signal, for example: "You are currently prohibited from speaking. Please request permission to speak first."
  • The purpose of the embodiments of the present invention is to improve the communication experience of video conferencing and to make video conferencing simple and efficient to use.
  • The advantages of the embodiments of the present invention are that the prompt is intuitive, its content can be rich and accurate, and it appears dynamically: under normal circumstances there is no prompt, which keeps interference to the user minimal.
  • the above modules or steps of the present invention can be implemented by a general-purpose computing device, which can be concentrated on a single computing device or distributed over a network composed of multiple computing devices. Alternatively, they may be implemented by program code executable by the computing device so that they may be stored in the storage device by the computing device, or they may be separately fabricated into respective integrated circuit modules. Alternatively, multiple modules or steps of them can be implemented as a single integrated circuit module. Thus, the invention is not limited to any specific combination of hardware and software.
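
The interaction between the main control module and the overlay module (steps 511 to 515 and 521 to 524) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation described in the patent: the class names `MainControl` and `OverlayModule`, the method names, and the use of plain strings to stand in for video frames are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class OverlayModule:
    """Hypothetical stand-in for the graphics overlay module (steps 521-524)."""
    prompt: str = "You are currently prohibited from speaking."
    overlay_on: bool = False

    def update(self, overlay_on):
        # Step 521: update the overlay state from the main control module's message.
        self.overlay_on = overlay_on

    def process(self, frame):
        # Steps 522-524: while the state is on, every outgoing frame gets the
        # prompt; otherwise the frame passes through unchanged.
        # Frames are modelled as strings purely for illustration.
        if self.overlay_on:
            return f"{frame} [{self.prompt}]"
        return frame


@dataclass
class MainControl:
    """Hypothetical stand-in for the main control module (steps 511-515)."""
    overlay: OverlayModule
    terminal_muted: bool = False

    def on_vad_state(self, sound_active):
        # Request the overlay only when the terminal is muted AND sound is
        # active; in every other case the overlay is cancelled.
        self.overlay.update(self.terminal_muted and sound_active)
```

With `control = MainControl(OverlayModule(), terminal_muted=True)`, calling `control.on_vad_state(True)` makes each subsequent `control.overlay.process(frame)` call return the frame with the prompt attached, and `control.on_vad_state(False)` restores pass-through. Keeping the on/off state in the overlay module mirrors the description's point that the overlay is repeated on every frame until it is cancelled.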

Abstract

The invention discloses a mute indication method and device applicable to video conferencing. The method comprises: an MCU performs sound activation detection on the audio media stream sent by a terminal that participates in the video conference and has been muted; the MCU obtains the detection result for the terminal, wherein the detection result is one of the following states: sound activation state or sound inactive state; and when the detection result is the sound activation state, the MCU superimposes a mute video indication on the video signal sent to the terminal. The invention addresses the problem in the related art that a mute indication given as a prompt tone is not intuitive and, to some extent, disturbs listening to the normal conference audio.
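
The two-state detection result and the condition for showing the indication can be illustrated with a short sketch. The patent only speaks of a "sound parameter" compared against a threshold; using the root-mean-square level of a block of PCM samples as that parameter is an assumption made here, and the names `VadState`, `rms_level`, `detect`, and `needs_mute_indication` are hypothetical.

```python
import math
from enum import Enum


class VadState(Enum):
    ACTIVE = "sound activation state"
    INACTIVE = "sound inactive state"


def rms_level(samples):
    """Root-mean-square level of a block of PCM samples (assumed sound parameter)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def detect(samples, threshold):
    """ACTIVE only when the sound parameter is strictly higher than the threshold."""
    if rms_level(samples) > threshold:
        return VadState.ACTIVE
    return VadState.INACTIVE


def needs_mute_indication(terminal_muted, state):
    """The indication is superimposed only for a muted terminal that is active."""
    return terminal_muted and state is VadState.ACTIVE
```

The strict comparison follows the wording "higher than" versus "not higher than": a block whose level exactly equals the threshold is classified as inactive.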

Description

Mute indication method and device applied to video conferencing

TECHNICAL FIELD

The present invention relates to the field of communications, and in particular to a mute indication method and device applied to video conferencing.

BACKGROUND OF THE INVENTION

A video conferencing system is a multimedia communication system that supports remote, bidirectional transmission of sound and video. It helps users in different locations communicate face to face, visually, in real time and in both directions.

Standards organizations such as the International Telecommunication Union (ITU), the Internet Engineering Task Force (IETF), and the Third Generation Partnership Project (3GPP) each work on multimedia standardization. The ITU has developed several multimedia communication standards, including ITU-T H.320, ITU-T H.323, and ITU-T H.324. Among them, ITU-T H.320 targets multimedia communication applications on narrowband circuit-switched networks, ITU-T H.323 targets multimedia communication applications on IP networks, and ITU-T H.324 targets multimedia communication applications on very low-speed networks, such as the PSTN (Public Switched Telephone Network) and mobile networks. The IETF is responsible for the Session Initiation Protocol (SIP) and the multimedia conferencing standards based on it. 3GPP is responsible for the IP Multimedia Subsystem (IMS) standard. Building on the IETF standards, it has also developed a multimedia conferencing standard for IMS networks, which is very close to the SIP-based standard established by the IETF.

Figure 1 depicts the basic principles of video conferencing communication. The terminal 101 is the device used by the user; there are terminals 1 to n. Each terminal contains a codec, which compresses, encodes, and decodes sound, video, and other media. The terminal is also connected to a microphone, a camera, a display, and a sound playback subsystem for sound and video input and output. It further includes a user input interface through which the user enters commands and information. When a video conference is held, the terminal 101 establishes a connection with an MCU (Multipoint Control Unit) 102 that carries two-way control signaling, audio, and video. To save network bandwidth, audio and video are generally transmitted over the network in a compression-encoded format.
The MCU 102 carries out multi-party conference communication. Each terminal 101 taking part in the multi-party conference establishes a connection with the MCU 102 for two-way communication of control signaling, audio, and video. The MCU 102 is responsible for switching and mixing the media streams. For audio, the MCU 102 usually outputs one mixed audio stream for each terminal 101; the mix is generally produced by superimposing the few input audio streams with the highest volume. For video, the MCU 102 can send one terminal the single-picture video stream of another terminal. If the MCU 102 supports a multi-picture function, it can also compose the video from several terminals into one multi-picture image and send it to one or more terminals.
To meet users' conference management needs, a video conference generally provides conference control functions. The conference control software 103 in Figure 1 performs these functions. An important one is mute control of the terminals: to achieve good voice communication, terminals that do not currently need to speak are usually muted, and once a terminal is muted, the other terminals in the same conference cannot hear it.
If a muted terminal is not notified that it has been muted, its user may try to speak. Because users at the other terminals cannot hear the speech, the user may mistake the situation for a system fault, which reduces ease of use.
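The mixing and mute behaviour described above for the MCU 102 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the MCU's actual algorithm: the function names, the use of RMS as the loudness measure, the limit of three mixed sources, the exclusion of a listener's own audio from its mix, and the 16-bit clamp are all hypothetical choices that the text does not specify.

```python
from typing import Dict, List, Set


def rms(frame: List[int]) -> float:
    """Root-mean-square level of one frame of PCM samples (assumed loudness measure)."""
    if not frame:
        return 0.0
    return (sum(s * s for s in frame) / len(frame)) ** 0.5


def mix_for_listeners(
    frames: Dict[str, List[int]],
    muted: Set[str],
    max_sources: int = 3,  # assumption: the text only says "the few loudest streams"
) -> Dict[str, List[int]]:
    """Build one mixed frame per terminal from the loudest unmuted inputs."""
    # Muted terminals never contribute to anyone's mix.
    audible = [t for t in frames if t not in muted]
    loudest = sorted(audible, key=lambda t: rms(frames[t]), reverse=True)[:max_sources]
    length = max((len(f) for f in frames.values()), default=0)

    mixes: Dict[str, List[int]] = {}
    for listener in frames:
        mix = [0] * length
        for source in loudest:
            if source == listener:
                continue  # assumption: a terminal does not hear itself
            for i, sample in enumerate(frames[source]):
                mix[i] += sample
        # Clamp to the 16-bit PCM range (assumed sample format).
        mixes[listener] = [max(-32768, min(32767, v)) for v in mix]
    return mixes
```

In this sketch a muted terminal still receives the conference mix but is absent from every other terminal's mix, which is exactly the situation in which its user cannot tell that nobody hears them.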
In a conventional audio conference system, the muted terminal is generally notified by playing it a special prompt tone, for example a "beep" played at intervals. The drawback of this approach is that the prompt is not intuitive and interferes to some extent with listening to the normal conference audio. No effective solution has yet been proposed for the problem that the tone-based mute prompt in the related art is not intuitive and interferes to some extent with listening to the normal conference audio.
SUMMARY OF THE INVENTION
The present invention aims to provide a mute indication method and device applied to video conferencing, so as to solve the problem that the tone-based mute prompt in the related art is not intuitive and interferes to some extent with listening to the normal conference audio.
According to one aspect of the present invention, a mute indication method applied to video conferencing is provided, including: a multipoint conference unit (MCU) performs voice activity detection (VAD) on the audio media stream sent by a terminal that is participating in a video conference and has been muted; the MCU obtains the detection result for the terminal, where the detection result is either a voice-active state or a voice-inactive state; and when the detection result is the voice-active state, the MCU superimposes the mute video indication on the video signal sent to the terminal.
Preferably, the MCU performing voice activity detection on the audio media stream sent by the terminal participating in the video conference includes: the MCU periodically performs voice activity detection on the audio media stream.
Preferably, the MCU obtaining the detection result for the terminal includes: if the sound parameter of the audio media stream is higher than the voice activity detection threshold, the MCU determines that the detection result is the voice-active state; if the sound parameter of the audio media stream is not higher than the threshold, the MCU determines that the detection result is the voice-inactive state.
Preferably, the MCU superimposing the mute video indication on the video signal sent to the terminal includes: the MCU superimposes text or an icon on the video signal sent to the terminal, the text or icon indicating that the terminal is muted.
Preferably, the MCU superimposing the mute video indication on the video signal sent to the terminal includes: the MCU repeats the superimposition of the mute video indication on every video frame sent to the terminal until the mute video indication is cancelled.
According to another aspect of the present invention, a mute indication device applied to video conferencing is provided. The device is disposed in an MCU and includes: a detection module, configured to perform voice activity detection on the audio media stream sent by a terminal that is participating in a video conference and has been muted; an acquisition module, configured to obtain the detection result for the terminal, where the detection result is either a voice-active state or a voice-inactive state; and an overlay module, configured to superimpose the mute video indication on the video signal sent to the terminal when the detection result is the voice-active state.
Preferably, the detection module is further configured to perform voice activity detection on the audio media stream periodically.
Preferably, the acquisition module includes: a first determining submodule, configured to determine that the detection result is the voice-active state if the sound parameter of the audio media stream is higher than the voice activity detection threshold; and a second determining submodule, configured to determine that the detection result is the voice-inactive state if the sound parameter of the audio media stream is not higher than the threshold.
Preferably, the overlay module is further configured to superimpose text or an icon on the video signal sent to the terminal, the text or icon indicating that the terminal is muted.
Preferably, the overlay module is further configured to repeat the superimposition of the mute video indication on every video frame sent to the terminal until the mute video indication is cancelled.
In the embodiments of the present invention, the MCU performs voice activity detection on the audio media stream sent by a terminal that is participating in a video conference and has been muted, and when the detection result is the voice-active state, the MCU superimposes a mute video indication on the video signal sent to that terminal. Thus, once a terminal has been muted, if its user tries to speak, a mute video indication message appears in the received video signal, for example "You are currently prohibited from speaking; please apply to speak first." The purpose of the embodiments is to improve the communication experience of video conferencing and to make video conferencing simple and efficient to use.
The advantages of the embodiments are that the prompt is intuitive, its content can be rich and precise, and it appears dynamically: under normal conditions no prompt is shown, which keeps interference with the user to a minimum.
BRIEF DESCRIPTION OF THE DRAWINGS
The drawings described here provide a further understanding of the invention and form part of this application. The illustrative embodiments of the invention and their description serve to explain the invention and do not unduly limit it. In the drawings:
FIG. 1 is a schematic diagram of the basic principle of video conference communication according to the related art;
FIG. 2 is a processing flowchart of a mute indication method applied to video conferencing according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an MCU device supporting video-overlay mute prompt information, and of the corresponding processing flow, according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of another MCU device supporting video-overlay mute prompt information, and of the corresponding processing flow, according to an embodiment of the present invention;
FIG. 5 is a processing flowchart of a specific embodiment according to an embodiment of the present invention;
FIG. 6 is a display effect diagram of a mute prompt using the video overlay approach according to an embodiment of the present invention;
FIG. 7 is a display effect diagram of a mute prompt using the video insertion approach according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a mute indication device applied to video conferencing according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of the acquisition module according to an embodiment of the present invention.
BEST MODE FOR CARRYING OUT THE INVENTION
The invention is described in detail below with reference to the drawings and in conjunction with the embodiments.
It should be noted that, provided they do not conflict, the embodiments of this application and the features in them may be combined with one another.
In a conventional audio conference system, the muted terminal is generally notified by playing it a special prompt tone, for example a "beep" played at intervals. The drawback of this approach is that the prompt is not intuitive and interferes to some extent with listening to the normal conference audio.
To solve the above technical problem, an embodiment of the present invention provides a mute indication method applied to video conferencing. The processing flow is shown in FIG. 2 and includes:
Step 202: the multipoint conference unit (MCU) performs voice activity detection on the audio media stream sent by a terminal that is participating in a video conference and has been muted;
Step 204: the MCU obtains the detection result for the terminal, where the detection result is either the voice-active state or the voice-inactive state;
Step 206: when the detection result is the voice-active state, the MCU superimposes a mute video indication on the video signal sent to the terminal.
In this embodiment, the MCU performs voice activity detection on the audio media stream sent by a terminal that is participating in a video conference and has been muted, and when the detection result is the voice-active state, the MCU superimposes a mute video indication on the video signal sent to that terminal. Thus, once a terminal has been muted, if its user tries to speak, a mute video indication message appears in the received video signal, for example "You are currently prohibited from speaking; please apply to speak first."
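Steps 202 to 206 amount to a small decision procedure, which the following Python sketch illustrates. It is a minimal illustration rather than the patented implementation: the names `Detection`, `mute_indication_needed`, and `peak_detector` are hypothetical, and the peak-amplitude detector with its fixed limit of 1000 is only a stand-in for whatever detector a real MCU would use.

```python
from enum import Enum
from typing import Callable, Sequence


class Detection(Enum):
    """The two possible detection results named in step 204."""
    VOICE_ACTIVE = 1
    VOICE_INACTIVE = 2


def mute_indication_needed(
    terminal_muted: bool,
    audio: Sequence[int],
    detector: Callable[[Sequence[int]], Detection],
) -> bool:
    """Return True when the mute video indication should be shown."""
    if not terminal_muted:
        return False  # the method only examines terminals that are muted
    result = detector(audio)  # steps 202 and 204: detect, then obtain the result
    return result is Detection.VOICE_ACTIVE  # step 206: indicate only on activity


def peak_detector(audio: Sequence[int]) -> Detection:
    """Hypothetical stand-in detector: active if any sample exceeds a fixed peak."""
    if audio and max(abs(s) for s in audio) > 1000:
        return Detection.VOICE_ACTIVE
    return Detection.VOICE_INACTIVE
```

Passing the detector in as a parameter keeps the decision logic independent of the detection algorithm, which the description leaves open.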
The purpose of the embodiments is to improve the communication experience of video conferencing and to make video conferencing simple and efficient to use. The advantages of the embodiments are that the prompt is intuitive, its content can be rich and precise, and it appears dynamically: under normal conditions no prompt is shown, which keeps interference with the user to a minimum.
Preferably, the MCU performing voice activity detection (VAD) on the audio media stream sent by the terminal participating in the video conference includes: the MCU periodically performs voice activity detection on the audio media stream. The MCU runs voice activity detection on the audio media stream continuously and outputs a detection result once every interval T1. The result takes one of two states: voice-active or voice-inactive. T1 can be an adjustable MCU configuration item.
Preferably, the MCU obtaining the detection result for the terminal includes: if the sound parameter of the audio media stream is higher than the VAD threshold, the MCU determines that the detection result is the voice-active state; if the sound parameter is not higher than the VAD threshold, the MCU determines that the detection result is the voice-inactive state. The VAD threshold can be adjusted as circumstances require.
In implementation, depending on the result of step 204, the MCU may either superimpose (or insert) the mute video indication in the video signal sent to the terminal, or cancel the superimposed (or inserted) mute video indication. The MCU first checks whether the terminal is muted.
If the terminal is muted, the MCU further determines whether the audio media stream currently sent by the terminal is active. If it is in the voice-active state, the MCU needs to send the mute video indication to the terminal; in all other cases the MCU stops sending the mute video indication. Here, "muted" refers to audio processing inside the MCU that prevents the other terminals in the video conference from hearing this terminal.
Preferably, when step 206 is implemented, the MCU superimposing the mute video indication on the video signal sent to the terminal includes: the MCU superimposes text or an icon on the video signal sent to the terminal, the text or icon indicating that the terminal is muted. The content, font, text size, color, display position, and other attributes of the text or icon can be adjustable configuration items.
In implementation, the MCU repeats the superimposition of the mute video indication on every video frame sent to the terminal until the mute video indication is cancelled. Once the indication is cancelled, the video frames are no longer overlaid.
As the description above indicates, with the insertion approach the MCU replaces the normal conference video stream with a mute prompt video stream once the mute video indication is inserted. The mute prompt video stream contains text or icon information indicating that the terminal is muted. When the mute video indication is cancelled, the MCU resumes sending the normal conference video stream.
FIG. 3 depicts an MCU device supporting video-overlay mute prompt information, and the corresponding processing flow, according to an embodiment of the present invention. The network interface module 301 handles communication with the terminals and sends and receives the audio and video media streams.
The network interface module 301 passes the received audio stream (1) to the audio decoding module 302, which decodes the compressed audio into a raw-format audio stream and passes the raw audio stream (2) to both the mixing module 303 and the voice activity detection module 304. The mixing module 303 mixes the audio streams from the multiple terminals to produce the multi-party call effect and passes the mixed audio stream (4) to the audio encoding module 305, which compresses and encodes the raw audio and passes the encoded audio stream (3) to the network interface module 301. The network interface module 301 passes the received video stream (5) to the video decoding module 306. The voice activity detection module 304 performs voice activity detection on the audio media stream sent by a terminal that is participating in the video conference and has been muted. In this embodiment T1 is 1000 ms, so every 1000 ms the voice activity detection module 304 reports the voice activity state (7) to the main control module 307. The main control module 307 decides whether a mute video indication is required: if the terminal is muted and a voice-active status is received, the mute video indication must be sent to the terminal; in all other cases the indication is stopped. The main control module 307 sends the command (8) indicating whether to send the mute video indication to the graphics overlay module 308, and the video decoding module 306 sends the raw-format video stream (6) destined for the terminal to the graphics overlay module 308.
The graphics overlay module 308 superimposes the mute prompt information on the raw-format video stream destined for the terminal and passes the resulting raw-format video stream (9) to the video encoder 309. The video encoder 309 compresses and encodes the stream and passes it to the network interface module 301, which sends it to the terminal. Through device configuration, the user can preset in the MCU the volume comparison threshold, the number of samples (or the corresponding time interval) used to calculate the volume, the prompt text content, the text color, the font size, the font type, and the position of the prompt text within the video frame.
FIG. 4 depicts another MCU device, and its processing flow, that supports mute prompt information using the video insertion approach according to an embodiment of the present invention. The network interface module 401 handles communication with the terminals and sends and receives the audio and video media streams. The network interface module 401 passes the received audio stream (1) to the audio decoding module 402, which decodes the compressed audio into a raw-format audio stream and passes the raw audio stream (2) to both the mixing module 403 and the voice activity detection module 404. The mixing module 403 mixes the audio streams from the multiple terminals to produce the multi-party call effect and passes the mixed audio stream (4) to the audio encoding module 405, which compresses and encodes the raw audio and passes the encoded audio stream (3) to the network interface module 401.
The video mixing and switching module 406 receives the video streams (5) sent by the terminals and either composes the video of several terminals into one multi-picture video or switches the video input of one terminal to other terminals. Its output video stream (6) is passed to the video switching module 407. The voice activity detection module 404 performs voice activity detection on the audio media stream sent by a terminal that is participating in the video conference and has been muted. In this embodiment T1 is 1000 ms, so every 1000 ms the voice activity detection module 404 reports the voice activity state (8) to the main control module 408. The main control module 408 decides whether a mute video indication is required: if the terminal is muted and a voice-active status is received, the mute video indication must be sent to the terminal; in all other cases the indication is stopped. The main control module 408 sends the command (9) indicating whether to send the mute video indication to the video switching module 407. According to that command, the video switching module 407 selects either the normal conference video stream (6) or the mute prompt video stream (7) and sends it to the terminal. The video prompt module 409 outputs the mute prompt video stream (7). The advantage of inserting a mute prompt video is that it saves media-processing resources, whereas a video overlay operation usually consumes considerable CPU resources.
FIG. 5 is a processing flowchart of an embodiment of the present invention, described on the basis of the MCU embodiment of FIG. 3.
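Before the FIG. 5 steps, the stream selection performed by the video switching module 407 in FIG. 4 can be sketched in a few lines of Python. This is a rough illustration under assumptions: frames are represented as opaque strings, the function name `switch_video` and the placeholder prompt frame `"MUTE-PROMPT"` are hypothetical, and one control flag per frame stands in for the command (9) from the main control module.

```python
from typing import Iterable, Iterator


def switch_video(
    conference_frames: Iterable[str],
    show_prompt_flags: Iterable[bool],
    prompt_frame: str = "MUTE-PROMPT",  # hypothetical stand-in for stream (7)
) -> Iterator[str]:
    """Yield, per frame, either the conference frame or the mute prompt frame."""
    for frame, show_prompt in zip(conference_frames, show_prompt_flags):
        # Selection only picks between two ready-made streams; no frame is
        # decoded, modified, or re-encoded here.
        yield prompt_frame if show_prompt else frame
```

For example, `list(switch_video(["c0", "c1", "c2"], [False, True, False]))` replaces only the middle frame. Because the switch never touches frame contents, it avoids the per-frame overlay work that the text identifies as CPU-intensive.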
The specific steps of the process are as follows:
Step 501: receive the raw-format audio stream data input from the terminal, for example audio data corresponding to a duration of 100 ms;
Step 502: perform voice activity detection using the most recently received audio stream data; depending on the VAD algorithm, the calculation may also need stored historical audio stream data and earlier calculation results; the VAD decision threshold can be configured by the user to adjust the decision sensitivity;
Step 503: output the voice activity state.
Steps 501 to 503 may be executed by the VAD module; after step 503 completes, the flow returns to step 501 and repeats.
The voice activity state is then output to the main control module, which executes steps 511 to 515:
Step 511: receive the input and update the voice activity state;
Step 512: determine whether the state is voice-active; if it is, go to step 513; otherwise, go to step 515;
Step 513: determine whether the terminal is muted; if it is, go to step 514; otherwise, go to step 515;
Step 514: send an overlay request message notifying the video overlay module to perform the video overlay, then return to step 511 and repeat;
Step 515: send a cancel-overlay message notifying the video overlay module to cancel the video overlay, then return to step 512 and repeat.
The overlay request message or cancel-overlay message is then output to the video overlay module, which executes steps 521 to 524:
Step 521: the video overlay module updates the video overlay state according to the input from the main control module;
Step 522: the video overlay module determines whether to perform the video overlay; if so, go to step 523; otherwise, go to step 524;
Step 523: the video overlay module superimposes the prompt information on the video signal sent to the terminal; the prompt information may be an icon expressing the muted state or a descriptive text string; the content, font, text size, color, display position, and other attributes of the prompt text can be adjustable configuration items;
Step 524: the video overlay module performs no overlay processing.
With the mute indication method provided by the embodiments of the present invention, a mute prompt can be generated in the video. For example, FIG. 6 shows one display effect of a mute prompt using the video overlay approach: the outer rectangle represents the television screen, the person icon represents the video signal watched at the terminal, and the text at the bottom is the superimposed mute prompt, for example "You are currently prohibited from speaking; please apply to speak first." As another example, FIG. 7 shows the display effect of a mute prompt using the video insertion approach; the outer rectangle represents the television screen.
Based on the same inventive concept, an embodiment of the present invention further provides a mute indication device applied to video conferencing. Its structure is shown in FIG. 8. The device is disposed in the multipoint conference unit (MCU) and includes:
a detection module 801, configured to perform voice activity detection on the audio media stream sent by a terminal that is participating in a video conference and has been muted;
an acquisition module 802, coupled to the detection module 801 and configured to obtain the detection result for the terminal, where the detection result is either the voice-active state or the voice-inactive state;
an overlay module 803, coupled to the acquisition module 802 and configured to superimpose the mute video indication on the video signal sent to the terminal when the detection result is the voice-active state.
In one embodiment, the detection module 801 may further be configured to perform voice activity detection on the audio media stream periodically.
In one embodiment, as shown in FIG. 9, the acquisition module 802 may include:
a first determining submodule 901, configured to determine that the detection result is the voice-active state if the sound parameter of the audio media stream is higher than the VAD threshold;
a second determining submodule 902, configured to determine that the detection result is the voice-inactive state if the sound parameter of the audio media stream is not higher than the VAD threshold.
The first determining submodule 901 and the second determining submodule 902 are two parallel functional modules, each coupled to the detection module 801.
In one embodiment, the overlay module 803 may further be configured to superimpose text or an icon on the video signal sent to the terminal, the text or icon indicating that the terminal is muted.
In one embodiment, the overlay module 803 may further be configured to repeat the superimposition of the mute video indication on every video frame sent to the terminal until the mute video indication is cancelled.
From the above description it can be seen that the present invention achieves the following technical effects:
In the embodiments of the present invention, the MCU performs voice activity detection on the audio media stream sent by a terminal that is participating in a video conference and has been muted, and when the detection result is the voice-active state, the MCU superimposes a mute video indication on the video signal sent to that terminal. Thus, once a terminal has been muted, if its user tries to speak, a mute video indication message appears in the received video signal, for example "You are currently prohibited from speaking; please apply to speak first." The purpose of the embodiments is to improve the communication experience of video conferencing and to make video conferencing simple and efficient to use. The advantages of the embodiments are that the prompt is intuitive, its content can be rich and precise, and it appears dynamically: under normal conditions no prompt is shown, which keeps interference with the user to a minimum.
Those skilled in the art will understand that the modules and steps of the present invention described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network composed of multiple computing devices.
Alternatively, they may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device; or they may be separately fabricated into individual integrated circuit modules; or multiple modules or steps among them may be implemented as a single integrated circuit module. Thus, the invention is not limited to any specific combination of hardware and software. The above is only the preferred embodiment of the present invention and is not intended to limit the present invention; various modifications and changes can be made to the present invention. Any modifications, equivalent substitutions, improvements, etc. made within the spirit and scope of the present invention are intended to be included within the scope of the present invention.
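The per-frame behaviour of the superimposing module described above (repeat the superimposition on every outgoing frame until the indication is cancelled) can be sketched as below. This is only an illustration under stated assumptions: `VideoFrame` is a hypothetical stand-in that records overlay text instead of pixel data, `SuperimposingModule` and `send_frames` are invented names, and the cancellation condition is assumed to be the detection result returning to the sound inactive state, which the description does not spell out.

```python
from dataclasses import dataclass, field
from typing import Iterable, List

# Paraphrase of the example prompt quoted in the description.
MUTE_TEXT = "You are currently prohibited from speaking. Please apply to speak first."


@dataclass
class VideoFrame:
    """Hypothetical frame model; a real frame would carry pixel data."""
    sequence: int
    overlays: List[str] = field(default_factory=list)


class SuperimposingModule:
    def __init__(self, text: str = MUTE_TEXT) -> None:
        self.text = text
        self.indication_active = False

    def on_detection_result(self, sound_activated: bool) -> None:
        # Assumption: the indication is cancelled as soon as the detection
        # result is no longer the sound activation state.
        self.indication_active = sound_activated

    def process(self, frame: VideoFrame) -> VideoFrame:
        # Repeated for every frame sent while the indication is active.
        if self.indication_active:
            frame.overlays.append(self.text)
        return frame


def send_frames(detection_results: Iterable[bool]) -> List[VideoFrame]:
    """Pair one detection result with each outgoing frame (simplification)."""
    module = SuperimposingModule()
    sent = []
    for sequence, sound_activated in enumerate(detection_results):
        module.on_detection_result(sound_activated)
        sent.append(module.process(VideoFrame(sequence)))
    return sent
```

Calling `send_frames([False, True, True, False])` marks only the two middle frames, so a muted user who stays silent sees no prompt at all, which is the low-interference property the description emphasises.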

Claims

1. A mute indication method applied to a video conference, comprising:
a multipoint conference unit (MCU) performing sound activation detection on an audio media stream sent by a terminal that is participating in the video conference and has been muted;
the MCU acquiring a detection result of the terminal, wherein the detection result comprises any one of the following: a sound activation state and a sound inactive state; and
when the detection result is the sound activation state, the MCU superimposing the mute video indication in a video signal sent to the terminal.
2. The method according to claim 1, wherein the MCU performing sound activation detection on the audio media stream sent by the terminal participating in the video conference comprises: the MCU periodically performing sound activation detection on the audio media stream.
3. The method according to claim 1 or 2, wherein the MCU acquiring the detection result of the terminal comprises:
if a sound parameter of the audio media stream is higher than a threshold value of the sound activation detection, the MCU determining that the detection result is the sound activation state; and
if the sound parameter of the audio media stream is not higher than the threshold value of the sound activation detection, the MCU determining that the detection result is the sound inactive state.
4. The method according to claim 3, wherein the MCU superimposing the mute video indication in the video signal sent to the terminal comprises: the MCU superimposing text or an icon in the video signal sent to the terminal, the text or icon being used to indicate that the terminal is muted.
5. The method according to claim 4, wherein the MCU superimposing the mute video indication in the video signal sent to the terminal comprises: the MCU repeating, on each video frame sent to the terminal, the process of superimposing the mute video indication until the mute video indication is cancelled.
6. A mute indication device applied to a video conference, disposed in a multipoint conference unit (MCU), comprising:
a detection module, configured to perform sound activation detection on an audio media stream sent by a terminal that is participating in the video conference and has been muted;
an acquisition module, configured to acquire a detection result of the terminal, wherein the detection result comprises any one of the following: a sound activation state and a sound inactive state; and
a superimposing module, configured to superimpose the mute video indication in a video signal sent to the terminal when the detection result is the sound activation state.
7. The device according to claim 6, wherein the detection module is further configured to periodically perform sound activation detection on the audio media stream.
8. The device according to claim 6 or 7, wherein the acquisition module comprises:
a first determining submodule, configured to determine that the detection result is the sound activation state if a sound parameter of the audio media stream is higher than a threshold value of the sound activation detection; and
a second determining submodule, configured to determine that the detection result is the sound inactive state if the sound parameter of the audio media stream is not higher than the threshold value of the sound activation detection.
9. The device according to claim 8, wherein the superimposing module is further configured to superimpose text or an icon in the video signal sent to the terminal, the text or icon being used to indicate that the terminal is muted.
10. The device according to claim 9, wherein the superimposing module is further configured to repeat, on each video frame sent to the terminal, the process of superimposing the mute video indication until the mute video indication is cancelled.
PCT/CN2011/084000 2010-12-16 2011-12-14 Mute indication method and device applied to video conferencing WO2012079510A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2010105916923A CN102025972A (en) 2010-12-16 2010-12-16 Mute indication method and device applied for video conference
CN201010591692.3 2010-12-16

Publications (1)

Publication Number Publication Date
WO2012079510A1 (en) 2012-06-21

Family

ID=43866746

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/084000 WO2012079510A1 (en) 2010-12-16 2011-12-14 Mute indication method and device applied to video conferencing

Country Status (2)

Country Link
CN (1) CN102025972A (en)
WO (1) WO2012079510A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102025972A (en) * 2010-12-16 2011-04-20 中兴通讯股份有限公司 Mute indication method and device applied for video conference
CN103516919B (en) * 2012-06-27 2018-03-27 中兴通讯股份有限公司 Send the method, apparatus and terminal of voice data
CN103595951A (en) * 2012-08-16 2014-02-19 中兴通讯股份有限公司 Audio frequency input state processing method, sending end equipment and receiving end equipment
CN102915743B (en) * 2012-10-12 2014-12-17 华为技术有限公司 Voice prompt playing method and device for conference system
CN110099182A (en) * 2018-01-27 2019-08-06 华为技术有限公司 One kind closing sound reminding method and device
CN111355919B (en) * 2018-12-24 2021-05-25 中移(杭州)信息技术有限公司 Communication session control method and device
CN111343410A (en) * 2020-02-14 2020-06-26 北京字节跳动网络技术有限公司 Mute prompt method and device, electronic equipment and storage medium
CN114449621B (en) * 2020-10-30 2023-03-24 极米科技股份有限公司 Method, device and storage medium for saving electric quantity consumption of multilink terminal
CN113038063B (en) * 2021-03-24 2023-03-03 百度在线网络技术(北京)有限公司 Method, apparatus, device, medium and product for outputting a prompt

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020188731A1 (en) * 2001-05-10 2002-12-12 Sergey Potekhin Control unit for multipoint multimedia/audio system
US20030185371A1 (en) * 2002-03-29 2003-10-02 Dobler Steve R. Mute status reminder for a communication device
CN1525758A (en) * 2003-06-20 2004-09-01 北京中星微电子有限公司 Videoconference audio frequency quality test method
CN101076108A (en) * 2007-06-19 2007-11-21 中兴通讯股份有限公司 Video conference terminal
CN101228810A (en) * 2005-07-27 2008-07-23 欧力天工股份有限公司 Sound system for conference
CN101646057A (en) * 2009-09-07 2010-02-10 深圳华为通信技术有限公司 Remote-presence conference control device, method and remote-presence conference system
CN101710961A (en) * 2009-12-09 2010-05-19 中兴通讯股份有限公司 Control method and device for generating title in video conference
CN102025972A (en) * 2010-12-16 2011-04-20 中兴通讯股份有限公司 Mute indication method and device applied for video conference

Also Published As

Publication number Publication date
CN102025972A (en) 2011-04-20

Similar Documents

Publication Publication Date Title
WO2012079510A1 (en) Mute indication method and device applied to video conferencing
US8456508B2 (en) Audio processing in a multi-participant conference
US8526587B2 (en) Web guided collaborative audio
US9509953B2 (en) Media detection and packet distribution in a multipoint conference
US8149261B2 (en) Integration of audio conference bridge with video multipoint control unit
US20100039498A1 (en) Caption display method, video communication system and device
WO2017129129A1 (en) Instant call method, device, and system
US9900552B2 (en) Conference processing method of third-party application and communication device thereof
US20090316870A1 (en) Devices and Methods for Performing N-Way Mute for N-Way Voice Over Internet Protocol (VOIP) Calls
JP2015532019A (en) User interaction monitoring for adaptive real-time communication
RU2658602C2 (en) Maintaining audio communication in an overloaded communication channel
US9088690B2 (en) Video conference system
US20090325561A1 (en) Method and system for enabling a conference call
CN111641602A (en) Session creation method and device and electronic equipment
WO2009030128A1 (en) A method and media server of obtaining the present active speaker in conference
WO2014005488A1 (en) Video data flow transmission method, terminal and system
JP2009272716A (en) Voip communication system
TWI435589B (en) Voip integrating system and method thereof
WO2014026625A1 (en) Method for processing audio input state, sending-end device and receiving-end device
CN103428468B (en) The method and system of Visual communications are carried out based on mobile terminal and set-top box collaboration
JP2007281600A (en) Content providing system and content switching method
CN111510662B (en) Network call microphone state prompting method and system based on audio and video analysis
JP4644813B2 (en) Multi-party call system, call terminal and call server in multi-party call system, multi-party call method
JP2008227693A (en) Speaker video display control system, speaker video display control method, speaker video display control program, communication terminal, and multipoint video conference system
TWI387301B (en) Method for displaying multiple video

Legal Events

Date Code Title Description
(none) 121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 11849270; Country of ref document: EP; Kind code of ref document: A1
(none) NENP Non-entry into the national phase. Ref country code: DE
(none) 122 EP: PCT application non-entry in European phase. Ref document number: 11849270; Country of ref document: EP; Kind code of ref document: A1