This document provides a detailed overview of the design and architecture of the Matter camera application, focusing on the Linux implementation.
The camera application is designed with a clear separation between the generic Matter cluster logic and the platform-specific hardware abstraction. This is achieved through the use of a CameraDeviceInterface, which defines the contract that any platform-specific camera implementation must adhere to.
The core components are:
- CameraApp: Responsible for initializing and managing the Matter clusters related to camera functionality. It is platform-agnostic and interacts with the hardware through the CameraDeviceInterface.
- CameraDevice: The Linux-specific implementation of the CameraDeviceInterface. It manages the camera hardware, using V4L2 and GStreamer, and provides the necessary delegates for the Matter clusters.
- DefaultMediaController: The central hub for media data distribution. It receives encoded media frames from the CameraDevice and distributes them to the various transport managers.
- Transport managers (WebRTCProviderManager, PushAvStreamTransportManager): These classes manage the specific transport protocols for streaming media to clients.
- Matter clusters (CameraAVStreamManagement, WebRTCTransportProvider).

This layered architecture makes the camera application straightforward to port: a new platform only needs to provide its own implementation of the CameraDeviceInterface.
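The separation described above can be illustrated with a minimal sketch. It is not the SDK's code: the interface is reduced to two HAL methods, and `FakeCameraDevice`, `SketchCameraApp`, and `StartStream` are hypothetical names introduced only for illustration. The stream ID type and return types are assumptions as well.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for the real interface; only two HAL methods are shown.
class CameraDeviceInterface
{
public:
    // Inner interface that abstracts hardware-specific operations.
    class CameraHALInterface
    {
    public:
        virtual ~CameraHALInterface()                = default;
        virtual bool StartVideoStream(int streamId) = 0;
        virtual bool StopVideoStream(int streamId)  = 0;
    };

    virtual ~CameraDeviceInterface()                      = default;
    virtual CameraHALInterface & GetCameraHALInterface() = 0;
};

// Hypothetical platform port: implements both interfaces and records calls
// instead of touching real hardware.
class FakeCameraDevice : public CameraDeviceInterface, public CameraDeviceInterface::CameraHALInterface
{
public:
    CameraDeviceInterface::CameraHALInterface & GetCameraHALInterface() override { return *this; }

    bool StartVideoStream(int streamId) override
    {
        calls.push_back("start " + std::to_string(streamId));
        return true;
    }

    bool StopVideoStream(int streamId) override
    {
        calls.push_back("stop " + std::to_string(streamId));
        return true;
    }

    std::vector<std::string> calls;
};

// Hypothetical platform-agnostic layer: it only ever sees the interface.
class SketchCameraApp
{
public:
    explicit SketchCameraApp(CameraDeviceInterface * device) : mDevice(device) { assert(device != nullptr); }

    bool StartStream(int streamId) { return mDevice->GetCameraHALInterface().StartVideoStream(streamId); }

private:
    CameraDeviceInterface * mDevice;
};
```

Because `SketchCameraApp` depends only on `CameraDeviceInterface`, swapping `FakeCameraDevice` for a different platform class requires no change to the application layer.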
CameraApp:
- Initializes and manages the Matter clusters (CameraAVStreamManagementCluster, WebRTCTransportProviderCluster, ChimeServer).
- Uses the CameraDeviceInterface to configure the clusters.
- Takes a CameraDeviceInterface pointer in its constructor.
- Calls methods on the CameraDeviceInterface to get delegates and hardware information.

CameraDeviceInterface:
- Defines CameraHALInterface, an inner interface that abstracts the hardware-specific operations (e.g., starting/stopping streams, taking snapshots).

CameraDevice:
- Implements CameraDeviceInterface and CameraDeviceInterface::CameraHALInterface for the Linux platform.
- Owns the manager classes (e.g., CameraAVStreamManager, WebRTCProviderManager).
- Owns the DefaultMediaController and passes the encoded media frames to it.
- Provides its CameraHALInterface implementation to the CameraAVStreamManager.
- Is instantiated in main.cpp and passed to CameraAppInit.
- Provides delegates through Get...Delegate() methods to the CameraApp.
- Its CameraHALInterface methods (like StartVideoStream, StopVideoStream) are called by CameraAVStreamManager.

DefaultMediaController:
- Receives encoded media frames from CameraDevice's GStreamer pipeline callbacks.
- Maintains a pre-roll buffer (PushAVPreRollBuffer) that stores a configurable duration of recent media frames. This is crucial for event-based recording, as it allows the recording to include footage from before the event occurred.
- Distributes frames to the registered transports (WebRTCTransport, PushAVTransport).
- Receives its media data from CameraDevice.
- WebRTCProviderManager and PushAvStreamTransportManager register their transport instances with the MediaController.

The manager classes in the camera-app are concrete implementations of the delegate interfaces defined in the Matter SDK. They act as a bridge between the generic cluster logic in the SDK and the specific hardware implementation in the camera-app.
CameraAVStreamManager:
- Implements chip::app::Clusters::CameraAvStreamManagement::CameraAVStreamManagementDelegate.
- Handles stream allocation requests (VideoStreamAllocate, AudioStreamAllocate, etc.).
- Validates requests and checks resource availability by querying the CameraDeviceHAL.
- Uses CameraDevice (specifically its CameraHALInterface implementation) to start and stop the GStreamer pipelines for the various streams (e.g., calling StartVideoStream, StopVideoStream).
- The CameraAVStreamManagementCluster in the SDK calls the methods of the CameraAVStreamManager to handle the commands it receives from the Matter network. For example, when a VideoStreamAllocate command is received, the CameraAVStreamManagementCluster calls the VideoStreamAllocate method on the CameraAVStreamManager.
- Interaction with CameraDevice: It holds a CameraDeviceInterface * mCameraDeviceHAL pointer, set via SetCameraDeviceHAL. When a stream needs to be started, stopped, or modified, CameraAVStreamManager calls the appropriate method on mCameraDeviceHAL->GetCameraHALInterface(), e.g., mCameraDeviceHAL->GetCameraHALInterface().StartVideoStream(allocatedStream).

WebRTCProviderManager:
- Implements chip::app::Clusters::WebRTCTransportProvider::Delegate.
- HandleSolicitOffer: When a client wants the camera to initiate the WebRTC handshake, this method creates a WebrtcTransport, generates an SDP Offer, and sends it to the client.
- HandleProvideOffer: When a client initiates the handshake, this method processes the received SDP Offer, creates a WebrtcTransport, generates an SDP Answer, and sends it back.
- Uses OnLocalDescription to send the locally generated SDP, and HandleProvideICECandidates to process candidates from the client.
- Once the connection is established (OnConnectionStateChanged(Connected)), it registers the WebrtcTransport with the DefaultMediaController to receive and send audio/video frames.
- Acquires and releases stream resources from the CameraAVStreamManager using AcquireAudioVideoStreams and ReleaseAudioVideoStreams.
- Handles LiveStreamPrivacyModeChanged to end sessions if live stream privacy is enabled.

PushAvStreamTransportManager:
- Implements chip::app::Clusters::PushAvStreamTransport::Delegate.
- AllocatePushTransport creates a PushAVTransport object for a given client request. This object is configured with details like container type (CMAF), segment duration, and target streams.
- Each PushAVTransport is registered with the DefaultMediaController to access media data, including the pre-roll buffer.
- ManuallyTriggerTransport: Allows a client to force a recording.
- HandleZoneTrigger: Called by CameraDevice when a motion zone alarm is raised. This checks which PushAVTransport instances are configured for that zone and initiates the recording and upload process.
- Checks the requested streams against the bandwidth limit before allocating a transport (ValidateBandwidthLimit).

ChimeManager:
- Implements chip::app::Clusters::Chime::Delegate.
- Provides the available chime sounds (GetChimeSoundByIndex).
- The PlayChimeSound command handler checks if the chime is enabled and logs the intent to play the selected sound. The current Linux example does not include actual audio playback for chimes.
- Queries the ChimeServer to get the enabled state and selected chime ID.

ZoneManager:
- Implements chip::app::Clusters::ZoneManagement::Delegate.
- Receives zone events from the CameraDevice HAL (e.g., OnZoneTriggeredEvent).
- Emits ZoneTriggered and ZoneStopped Matter events to subscribers.
- CreateTrigger, UpdateTrigger, RemoveTrigger commands delegate to the CameraHALInterface.
- CameraDevice calls OnZoneTriggeredEvent on this manager when the HAL detects activity in a zone.
- PushAvStreamTransportManager is notified of zone triggers to start recordings.

CameraAVSettingsUserLevelManager:
- Implements chip::app::Clusters::CameraAvSettingsUserLevelManagement::Delegate.
- Commands such as MPTZSetPosition, MPTZRelativeMove, and MPTZMoveToPreset are received from the SDK cluster.
- Uses the CameraHALInterface (CameraDevice) to interact with the physical hardware (simulated in this app).
- DPTZSetViewport: Sets the digital viewport for a specific allocated video stream ID. It validates the requested viewport against the stream's resolution, aspect ratio, and the camera sensor's capabilities. The change is applied via CameraHALInterface::SetViewport.
- DPTZRelativeMove: Adjusts the current viewport of a specific video stream by a delta. Calculations are done to keep the viewport within bounds and maintain the aspect ratio.

graph TD
    subgraph "Platform Agnostic"
        CameraApp
        CameraDeviceInterface
    end
    subgraph "Linux Platform"
        CameraDevice -- implements --> CameraDeviceInterface
        CameraDevice -- owns --> GStreamer
        CameraDevice -- uses --> V4L2
        CameraDevice -- owns --> DefaultMediaController
        CameraDevice -- owns --> CameraAVStreamManager
        CameraDevice -- owns --> WebRTCProviderManager
        CameraDevice -- owns --> PushAvStreamTransportManager
        CameraDevice -- owns --> ChimeManager
        CameraDevice -- owns --> ZoneManager
        WebRTCProviderManager --> DefaultMediaController
        PushAvStreamTransportManager --> DefaultMediaController
    end
    subgraph "Matter Clusters (SDK)"
        CameraAVStreamManagementCluster
        WebRTCTransportProviderCluster
        PushAvStreamTransportServer
        ChimeServer
        ZoneMgmtServer
    end
    Main --> CameraDevice
    Main --> CameraApp
    CameraApp -- aggregates --> CameraDeviceInterface
    CameraApp --> CameraAVStreamManagementCluster
    CameraApp --> WebRTCTransportProviderCluster
    CameraApp --> PushAvStreamTransportServer
    CameraApp --> ChimeServer
    CameraApp --> ZoneMgmtServer
    CameraAVStreamManagementCluster -- uses delegate --> CameraAVStreamManager
    WebRTCTransportProviderCluster -- uses delegate --> WebRTCProviderManager
    PushAvStreamTransportServer -- uses delegate --> PushAvStreamTransportManager
    ChimeServer -- uses delegate --> ChimeManager
    ZoneMgmtServer -- uses delegate --> ZoneManager
    CameraAVStreamManager -- calls HAL --> CameraDevice
    ZoneManager -- calls HAL --> CameraDevice
sequenceDiagram
    participant Client
    participant CameraAVStreamManagementCluster
    participant CameraAVStreamManager as Delegate
    participant CameraDevice as HAL
    Client ->> CameraAVStreamManagementCluster: VideoStreamAllocate Request
    CameraAVStreamManagementCluster ->> Delegate: VideoStreamAllocate()
    Delegate ->> HAL: GetAvailableVideoStreams()
    HAL -->> Delegate: List of streams
    Delegate ->> Delegate: Find compatible stream
    Delegate ->> HAL: IsResourceAvailable()
    HAL -->> Delegate: Yes/No
    alt Resources Available
        Delegate -->> CameraAVStreamManagementCluster: Success, streamID
        CameraAVStreamManagementCluster ->> Delegate: OnVideoStreamAllocated()
        Delegate ->> HAL: StartVideoStream(streamID)
        HAL ->> HAL: Configure & Start GStreamer Pipeline
    else Resources NOT Available
        Delegate -->> CameraAVStreamManagementCluster: ResourceExhausted
    end
    CameraAVStreamManagementCluster -->> Client: VideoStreamAllocate Response
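The allocation sequence in the diagram above can be approximated in code. This is a sketch under stated assumptions, not the delegate's real implementation: `FakeHal`, `SketchStreamManager`, the `freeEncoders` counter, the exact-resolution match, and the `NotFound` status are all hypothetical. Only the method names `GetAvailableVideoStreams`, `IsResourceAvailable`, `VideoStreamAllocate`, `OnVideoStreamAllocated`, and `StartVideoStream` come from the diagram, and their real signatures are not shown in this document.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical status codes; ResourceExhausted mirrors the diagram.
enum class Status { Success, NotFound, ResourceExhausted };

struct VideoStream
{
    uint16_t id;
    uint16_t width;
    uint16_t height;
    bool allocated;
};

// Hypothetical HAL stand-in with a single countable resource.
struct FakeHal
{
    std::vector<VideoStream> streams;
    int freeEncoders = 1; // assumption: one encoder slot per running stream
    std::vector<uint16_t> started;

    std::vector<VideoStream> & GetAvailableVideoStreams() { return streams; }
    bool IsResourceAvailable() const { return freeEncoders > 0; }
    void StartVideoStream(uint16_t id)
    {
        --freeEncoders;
        started.push_back(id);
    }
};

class SketchStreamManager
{
public:
    explicit SketchStreamManager(FakeHal & hal) : mHal(hal) {}

    // Find a compatible, unallocated stream, then check resources.
    Status VideoStreamAllocate(uint16_t width, uint16_t height, uint16_t & outStreamId)
    {
        for (VideoStream & stream : mHal.GetAvailableVideoStreams())
        {
            if (stream.allocated || stream.width != width || stream.height != height)
            {
                continue;
            }
            if (!mHal.IsResourceAvailable())
            {
                return Status::ResourceExhausted;
            }
            stream.allocated = true;
            outStreamId      = stream.id;
            return Status::Success;
        }
        return Status::NotFound;
    }

    // Called back by the cluster after a successful allocation.
    void OnVideoStreamAllocated(uint16_t streamId) { mHal.StartVideoStream(streamId); }

private:
    FakeHal & mHal;
};
```

Splitting allocation from `OnVideoStreamAllocated` keeps the pipeline from starting until the cluster has accepted the result, which matches the ordering in the diagram.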
sequenceDiagram
    participant Client
    participant PushAVStreamTransportServer as SDK Cluster
    participant PushAvStreamTransportManager as Delegate
    participant MediaController
    participant CameraAVStreamManager
    Client ->> SDK Cluster: AllocatePushTransport Request
    SDK Cluster ->> Delegate: AllocatePushTransport()
    Delegate ->> CameraAVStreamManager: GetBandwidthForStreams()
    CameraAVStreamManager -->> Delegate: Bandwidth
    Delegate ->> Delegate: ValidateBandwidthLimit()
    alt Bandwidth OK
        Delegate ->> Delegate: Create PushAVTransport instance
        Delegate ->> MediaController: RegisterTransport(PushAVTransport, videoID, audioID)
        MediaController ->> MediaController: Add to transport list
        Delegate ->> MediaController: SetPreRollLength()
        Delegate -->> SDK Cluster: Success
    else Bandwidth Exceeded
        Delegate -->> SDK Cluster: ResourceExhausted
    end
    SDK Cluster -->> Client: AllocatePushTransport Response
sequenceDiagram
    participant Client
    participant WebRTCTransportProviderCluster as SDK Cluster
    participant WebRTCProviderManager as Delegate
    participant WebrtcTransport
    participant MediaController
    participant CameraAVStreamManager
    Client ->> SDK Cluster: ProvideOffer Request (SDP Offer)
    SDK Cluster ->> Delegate: HandleProvideOffer()
    Delegate ->> Delegate: Create WebrtcTransport
    Delegate ->> WebrtcTransport: SetRemoteDescription(Offer)
    Delegate ->> CameraAVStreamManager: AcquireAudioVideoStreams()
    CameraAVStreamManager -->> Delegate: Success
    Delegate ->> WebrtcTransport: CreateAnswer()
    WebrtcTransport -->> Delegate: OnLocalDescription(SDP Answer)
    Delegate ->> Delegate: ScheduleAnswerSend()
    SDK Cluster -->> Client: ProvideOffer Response
    Delegate -->> Client: Answer Command (SDP Answer)
    Client ->> SDK Cluster: ProvideICECandidates Request
    SDK Cluster ->> Delegate: HandleProvideICECandidates()
    Delegate ->> WebrtcTransport: AddRemoteCandidate()
    Note right of Delegate: Meanwhile...
    WebrtcTransport -->> Delegate: OnICECandidate (Local Candidates)
    Delegate ->> Delegate: ScheduleICECandidatesSend()
    Delegate -->> Client: ICECandidates Command
    Note over Client, WebrtcTransport: ICE Connectivity Establishment
    WebrtcTransport -->> Delegate: OnConnectionStateChanged(Connected)
    Delegate ->> MediaController: RegisterTransport(WebrtcTransport, videoID, audioID)
    Note over Client, MediaController: Live Stream Starts
graph TD
    subgraph "Camera Hardware"
        CameraSensor
    end
    subgraph "Linux Kernel"
        V4L2_Driver
    end
    subgraph "Userspace (Camera App)"
        GStreamer_Pipeline
        CameraDevice
        DefaultMediaController
        WebRTCTransport
        PushAVTransport
    end
    subgraph "Matter Network"
        ClientDevice
    end
    CameraSensor --> V4L2_Driver
    V4L2_Driver --> GStreamer_Pipeline
    GStreamer_Pipeline -- Encoded Data --> CameraDevice
    CameraDevice -- Encoded Data (H.264/Opus) --> DefaultMediaController
    DefaultMediaController --> WebRTCTransport
    DefaultMediaController --> PushAVTransport
    WebRTCTransport --> ClientDevice
    PushAVTransport --> ClientDevice
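The fan-out from the media controller to its transports, together with the pre-roll buffer, can be sketched as follows. This is an illustration only: `SketchMediaController`, `Frame`, `Transport`, `CountingTransport`, and the millisecond timestamps are hypothetical, and the eviction rule (drop frames older than the newest timestamp minus the pre-roll length) is an assumption about how a time-bounded buffer could work. The method names `RegisterTransport`, `SetPreRollLength`, and `DistributeVideo` appear in this document, but their real signatures are not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

struct Frame
{
    int64_t ptsMs;             // presentation timestamp in milliseconds (assumed unit)
    std::vector<uint8_t> data; // encoded payload
};

class Transport
{
public:
    virtual ~Transport()                        = default;
    virtual void SendVideo(const Frame & frame) = 0;
};

// Hypothetical transport that only counts what it receives.
class CountingTransport : public Transport
{
public:
    void SendVideo(const Frame &) override { ++received; }
    size_t received = 0;
};

class SketchMediaController
{
public:
    void RegisterTransport(Transport * transport)
    {
        assert(transport != nullptr);
        mTransports.push_back(transport);
    }

    void SetPreRollLength(int64_t preRollMs) { mPreRollMs = preRollMs; }

    void DistributeVideo(const Frame & frame)
    {
        // Keep the frame for later pre-roll use, then trim anything too old.
        mPreRoll.push_back(frame);
        while (!mPreRoll.empty() && mPreRoll.front().ptsMs < frame.ptsMs - mPreRollMs)
        {
            mPreRoll.pop_front();
        }
        // Fan the frame out to every registered transport.
        for (Transport * transport : mTransports)
        {
            transport->SendVideo(frame);
        }
    }

    size_t BufferedFrames() const { return mPreRoll.size(); }
    int64_t OldestBufferedPtsMs() const { return mPreRoll.front().ptsMs; } // requires a non-empty buffer

private:
    int64_t mPreRollMs = 0;
    std::deque<Frame> mPreRoll;
    std::vector<Transport *> mTransports;
};
```

A `std::deque` is used because frames are appended at the back and evicted from the front, so both operations stay cheap.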
Video stream request flow:
1. A client sends a video stream allocation request to the CameraAVStreamManagement cluster.
2. The CameraAVStreamManagementCluster (in the SDK) receives the request and calls the VideoStreamAllocate method on its delegate, the CameraAVStreamManager.
3. The CameraAVStreamManager validates the request, checks for compatible stream configurations and available resources by querying the CameraDevice (HAL).
4. If the checks pass, the CameraAVStreamManager updates the stream state.
5. The SDK cluster then notifies the CameraAVStreamManager via OnVideoStreamAllocated.
6. The CameraAVStreamManager calls StartVideoStream on the CameraDevice's CameraHALInterface.
7. The CameraDevice creates and starts a GStreamer pipeline to handle the video stream. The pipeline is configured to:
   - Capture video from the camera (v4l2src).
   - Convert the video format (videoconvert).
   - Encode the video to H.264 (x264enc).
   - Deliver the encoded frames to an appsink.
8. The appsink has a callback function (OnNewVideoSampleFromAppSink) that is called for each new frame.
9. The callback passes the encoded frame to the DefaultMediaController.
10. The DefaultMediaController pushes the frame to its pre-roll buffer, which then distributes it to all registered transports.
11. The WebRTCTransport receives the frame and sends it over the established WebRTC connection to the client.
12. The PushAVTransport receives the frame and includes it in the recording that is pushed to the client.

Snapshot request flow:
1. A client sends a snapshot request to the CameraAVStreamManagement cluster.
2. The CameraAVStreamManagementCluster receives the request and calls the CaptureSnapshot method on the CameraAVStreamManager.
3. The CameraAVStreamManager delegates the call to CameraDevice::CaptureSnapshot.
4. The CameraDevice (if a snapshot stream is not running) creates a GStreamer pipeline to capture a single frame. The pipeline is configured to:
   - Capture from the camera (v4l2src or libcamerasrc).
   - Encode the frame as JPEG (jpegenc).
   - Write the image to a file (multifilesink).
5. The CameraDevice then reads the JPEG file from disk and sends the data back to the client as the response to the snapshot request.

GStreamer is used extensively in the CameraDevice to handle all media processing. The CameraDevice class contains helper methods (CreateVideoPipeline, CreateAudioPipeline, CreateSnapshotPipeline, CreateAudioPlaybackPipeline) that construct these GStreamer pipelines.
Pipelines are dynamically created, started, and stopped based on requests from the Matter clusters, as orchestrated by the various managers (especially CameraAVStreamManager).
Video Streaming (CreateVideoPipeline):
- v4l2src (or videotestsrc for testing) to capture from the camera device.
- capsfilter to set resolution and framerate.
- videoconvert to ensure the format is suitable for the encoder (e.g., I420).
- x264enc for H.264 encoding.
- appsink with the OnNewVideoSampleFromAppSink callback. This callback receives the encoded H.264 buffers and passes them to DefaultMediaController::DistributeVideo.
- Started by CameraDevice::StartVideoStream, stopped by CameraDevice::StopVideoStream.

Audio Streaming (CreateAudioPipeline):
- pulsesrc (or audiotestsrc for testing).
- capsfilter to set the sample rate and channel count.
- audioconvert and audioresample.
- opusenc for Opus encoding.
- appsink with the OnNewAudioSampleFromAppSink callback, which passes encoded Opus buffers to DefaultMediaController::DistributeAudio.
- Started by CameraDevice::StartAudioStream, stopped by CameraDevice::StopAudioStream.

Snapshots (CreateSnapshotPipeline):
- v4l2src or libcamerasrc depending on camera type.
- capsfilter for resolution/format.
- jpegenc to create a JPEG image.
- multifilesink to save the JPEG to a temporary file (SNAPSHOT_FILE_PATH). The CameraDevice::CaptureSnapshot method then reads this file.
- Used when CameraDevice::CaptureSnapshot is called and a snapshot stream is active.

Audio Playback (CreateAudioPlaybackPipeline):
- udpsrc to receive RTP Opus packets from the network (e.g., from a WebRTC session).
- rtpjitterbuffer to handle network jitter.
- rtpopusdepay to extract Opus frames from RTP.
- opusdec to decode Opus audio.
- audioconvert, audioresample, and autoaudiosink to play the audio on the device speakers.
- Started by CameraDevice::StartAudioPlaybackStream, stopped by CameraDevice::StopAudioPlaybackStream.

The state of these pipelines (e.g., NULL, READY, PLAYING) is managed using gst_element_set_state. Callbacks on appsink elements are crucial for linking GStreamer's data flow to the Matter application logic.
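To make the video element chain concrete, the sketch below assembles it as a textual pipeline description of the kind accepted by `gst-launch-1.0` or `gst_parse_launch`. Whether CreateVideoPipeline builds its pipeline from a description string or element by element is not stated in this document, so treat this purely as an illustration. `VideoPipelineConfig`, `BuildVideoPipelineDescription`, the device path, and the `videosink` element name are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical configuration for the sketch.
struct VideoPipelineConfig
{
    std::string device; // e.g. a V4L2 device node
    int width;
    int height;
    int framerate;
    bool useTestSource; // swap in videotestsrc when no camera is attached
};

// Builds: source ! caps ! videoconvert ! x264enc ! appsink
inline std::string BuildVideoPipelineDescription(const VideoPipelineConfig & config)
{
    const std::string source = config.useTestSource ? "videotestsrc" : "v4l2src device=" + config.device;

    // A caps string between two elements acts as a capsfilter.
    const std::string caps = "video/x-raw,width=" + std::to_string(config.width) +
        ",height=" + std::to_string(config.height) + ",framerate=" + std::to_string(config.framerate) + "/1";

    return source + " ! " + caps + " ! videoconvert ! x264enc ! appsink name=videosink";
}
```

For a 640x480 stream at 30 fps from `/dev/video0`, the function returns `v4l2src device=/dev/video0 ! video/x-raw,width=640,height=480,framerate=30/1 ! videoconvert ! x264enc ! appsink name=videosink`. An application could hand such a string to `gst_parse_launch`, look up the appsink by name to attach its new-sample callback, and then move the pipeline to PLAYING with `gst_element_set_state`.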