---
license: cc-by-nc-4.0
---
### <i>CardioNetV2</i>
**CardioNetV2 is the latest multi-modal model in the Cardio Sonix line.
It is built from architectures originally designed for computer vision
(a modified ResNet) and for NLP (an LSTM),
and it works with two kinds of input: an audio signal and tabular data.
The audio signal is treated as a sequence of tokens:
a mel-cepstrogram is extracted from the audio,
in which each time frame is described by N mel-cepstral coefficients.
First, the LSTM takes the mel-cepstrogram as input
and produces an output tensor that is passed to the ResNet (Residual Neural Network),
a model from the residual-network family modified for audio signal processing.
This implementation uses residual blocks with pre-activation.
The ResNet output then goes to the DenseMixer.
The DenseMixer processes the audio and tabular features separately,
concatenates the two outputs into a single dense feature vector, and runs a final stage on that vector
to produce a prediction based on both audio and tabular data.**
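
To make the tokenization step concrete, here is a minimal, dependency-free sketch of how a mel-cepstrogram can be computed: one vector of N mel-cepstral coefficients per time frame. It is an illustration under assumptions, not the preprocessing shipped with CardioNetV2: the frame length, hop size, filter count, coefficient count, sample rate, and every function name below are hypothetical, and a real pipeline would normally use an audio library with an FFT rather than the naive DFT used here.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum of one frame (bins 0 .. n/2)."""
    n = len(frame)
    # A Hamming window reduces spectral leakage at the frame edges.
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        spectrum.append(abs(s) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 2 - 1) for i in range(n_filters + 2)]
    # Filter edges expressed as (fractional) DFT bin positions.
    edges = [mel_to_hz(m) * n_fft / sample_rate for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, center, right = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for b in range(n_bins):
            if left < b <= center:
                row.append((b - left) / (center - left))
            elif center < b < right:
                row.append((right - b) / (right - center))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def dct_ii(values, n_coeffs):
    """First n_coeffs of the unnormalized DCT-II."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_coeffs)]


def mel_cepstrogram(signal, sample_rate, frame_len=64, hop=32,
                    n_filters=10, n_coeffs=6):
    """Return one list of n_coeffs cepstral coefficients per time frame."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        power = power_spectrum(signal[start:start + frame_len])
        # Small constant keeps the logarithm defined for empty bands.
        log_mel = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
                   for row in bank]
        frames.append(dct_ii(log_mel, n_coeffs))
    return frames


# Toy input: 0.1 s of a 440 Hz tone at 8 kHz (not real heart-sound data).
sample_rate = 8000
signal = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(800)]
tokens = mel_cepstrogram(signal, sample_rate)
print(len(tokens), len(tokens[0]))  # 24 6
```

Each inner list plays the role of one token, so the sequence model receives a `(time frames, N)` matrix; here that is 24 frames of 6 coefficients.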
![](https://i.ibb.co/gW14Dh2/attached.png)
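
The two less familiar pieces of the pipeline, the pre-activation ordering inside a residual block and the late fusion performed by the DenseMixer, can be sketched in plain Python. This is a toy illustration under stated assumptions, not the released architecture: the layer sizes and random weights are made up, dense layers stand in for the convolutions a ResNet would use, simple standardization stands in for the normalization layer, the LSTM is replaced by a placeholder vector, and the single sigmoid output is an assumption about the form of the prediction. All function and parameter names are hypothetical.

```python
import math
import random

random.seed(0)


def dense(n_in, n_out):
    """Create a randomly initialized dense layer as (weights, biases)."""
    weights = [[random.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def apply_dense(layer, x):
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def normalize(x, eps=1e-5):
    """Standardize a vector (a stand-in for the normalization layer)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def preact_residual_block(x, layer1, layer2):
    """Pre-activation: norm -> ReLU -> layer, twice, then add the identity skip."""
    out = apply_dense(layer1, relu(normalize(x)))
    out = apply_dense(layer2, relu(normalize(out)))
    return [a + b for a, b in zip(x, out)]


def dense_mixer(audio_features, tabular_features, params):
    """Process each modality separately, concatenate, then predict."""
    audio_out = relu(apply_dense(params["audio"], audio_features))
    tabular_out = relu(apply_dense(params["tabular"], tabular_features))
    fused = audio_out + tabular_out        # list concatenation
    logit = apply_dense(params["head"], fused)[0]
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid score in (0, 1)


audio_dim, tab_dim, hidden = 8, 4, 5
params = {
    "res1": dense(audio_dim, audio_dim),
    "res2": dense(audio_dim, audio_dim),
    "audio": dense(audio_dim, hidden),
    "tabular": dense(tab_dim, hidden),
    "head": dense(2 * hidden, 1),
}
audio_vec = [random.gauss(0, 1) for _ in range(audio_dim)]  # placeholder for the LSTM output
tabular_vec = [random.gauss(0, 1) for _ in range(tab_dim)]  # placeholder tabular features

audio_features = preact_residual_block(audio_vec, params["res1"], params["res2"])
score = dense_mixer(audio_features, tabular_vec, params)
print(f"{score:.3f}")
```

In the pre-activation ordering, normalization and the nonlinearity come before each layer, so the skip connection adds the untouched input to the block output. Concatenating the two branch outputs before the final layer lets that layer weigh audio and tabular evidence together.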