Friday, March 23, 2018

[Repost] Abnormal sound detection: Kaldi DNN training

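Reposted from: http://blog.csdn.net/huchad/article/details/52092796

Using Kaldi's DNN for audio classification and abnormal sound detection.

HMM/GMM -> HMM/DNN

The approach largely follows the speech recognition recipe; only two points need attention.

1. When training the HMM/GMM, it is enough to stop at the monophone stage; use the monophone HMM and its alignments to train the DNN.

2. For the language model, manually constructing a simple unigram model is enough.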
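
One way to picture the hand-built unigram model from point 2 is as a tiny ARPA-format file with one entry per sound class. The sketch below is only an illustration under assumptions: the function name MakeUnigramArpa and the class labels are hypothetical, the probabilities are simply uniform, and the original post does not say which format or tool it used.

```cpp
#include <cassert>
#include <cmath>
#include <sstream>
#include <string>
#include <vector>

// Builds a uniform unigram LM in ARPA format for the given class labels.
// The labels are hypothetical; </s> shares the probability mass uniformly
// with the labels, and <s> gets the conventional -99 log10 probability.
std::string MakeUnigramArpa(const std::vector<std::string> &labels) {
  assert(!labels.empty());
  const double log10_prob =
      -std::log10(static_cast<double>(labels.size() + 1));
  std::ostringstream os;
  os << "\\data\\\n";
  os << "ngram 1=" << labels.size() + 2 << "\n\n";  // labels + <s> + </s>
  os << "\\1-grams:\n";
  os << "-99\t<s>\n";
  os << log10_prob << "\t</s>\n";
  for (const std::string &w : labels) os << log10_prob << "\t" << w << "\n";
  os << "\n\\end\\\n";
  return os.str();
}
```

Calling MakeUnigramArpa({"normal", "abnormal", "silence"}) returns the text of a five-entry unigram model. In a Kaldi setup such a file would typically be converted to G.fst (for example with arpa2fst) when preparing data/lang.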

The main DNN training steps are as follows:

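# Step 1: Pre-train the DBN
#   --cmvn-opts:  mean and variance normalization
#   --delta-opts: delta (differential) features
#   --splice:     number of context frames to splice
#   --nn-depth:   number of hidden layers
#   --hid-dim:    number of units per hidden layer
#   --rbm-iter:   number of iterations
steps/nnet/pretrain_dbn.sh \
  --cmvn-opts "--norm-means=true --norm-vars=true" \
  --delta-opts "--delta-order=2" \
  --splice 5 \
  --nn-depth 3 \
  --hid-dim 256 \
  --rbm-iter 8 \
  $train $dir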
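
The feature options in step 1 determine the width of the network's input layer. The arithmetic can be sketched as below, assuming that --splice 5 means five frames of context on each side (11 frames in total) and that --delta-order=2 triples the per-frame dimension. The function name and the base dimension of 40 used in the usage note are hypothetical; the original post does not state its feature type.

```cpp
#include <cassert>

// Input dimension of the network after add-deltas and frame splicing.
//   base_dim:    per-frame feature dimension (hypothetical value)
//   delta_order: value passed as --delta-order
//   splice:      value passed as --splice (context frames on each side,
//                an assumption about how the option is interpreted)
int SplicedInputDim(int base_dim, int delta_order, int splice) {
  assert(base_dim > 0 && delta_order >= 0 && splice >= 0);
  const int with_deltas = base_dim * (delta_order + 1);  // static + deltas
  const int num_frames = 2 * splice + 1;                 // left + center + right
  return with_deltas * num_frames;
}
```

With 40-dimensional features, SplicedInputDim(40, 2, 5) gives 40 * 3 * 11 = 1320 inputs, which is then followed by the three 256-unit hidden layers configured above.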

# Step 2: Train the DNN, optimizing per-frame cross-entropy
#   --dbn:        the DBN obtained in step 1
#   --hid-layers: 0 means the hidden layers of the DBN are used
#   --learn-rate: learning rate
steps/nnet/train.sh \
  --feature-transform $feature_transform \
  --dbn $dbn \
  --hid-layers 0 \
  --learn-rate 0.008 \
  ${train}_tr90 ${train}_cv10 data/lang $ali $ali $dir

# Step 3: Generate lattices and alignments for sMBR
steps/nnet/align.sh --nj 20 --cmd "$train_cmd" $train data/lang $srcdir ${srcdir}_ali

steps/nnet/make_denlats.sh --nj 20 --cmd "$decode_cmd" --config conf/decode_dnn.config --acwt $acwt \
$train data/lang $srcdir ${srcdir}_denlats

# Step 4: Re-train the DNN by iterations of sMBR
steps/nnet/train_mpe.sh --cmd "$cuda_cmd" --num-iters 6 --acwt $acwt --do-smbr true \
  $train data/lang $srcdir ${srcdir}_ali ${srcdir}_denlats $dir


Monday, March 19, 2018

Will Trump abolish the H1B visa?

The Office of the President of the US is one of the most powerful offices in the world. There are many things President Trump can do by the powers vested in him by us, the people of the United States. Abolishing the H1B visa requires a long-winded legislative process, which the President can technically initiate but whose outcome is largely unpredictable. In the short term, however, the President can use Executive Orders to make renewals and benefits very restrictive for H1B workers. What will be the impact of such actions? Unfortunately, not everything in life can be statistically modeled or simulated, but a drastic action like abolishing the H1B has some common-sense, inevitable consequences:

The US economy saw its highest growth from the 1950s through the 1990s, and the US became a global superpower because of technology and innovation. Immigration of highly skilled workers has contributed greatly to the US economy (the US has been the most preferred destination for tech workers; the principle: excellence breeds excellence). Tech startups and companies in the US have become worldwide successes, which in turn attracts more bright minds to immigrate here. These immigrants have contributed not only through employment but also by founding companies that have enhanced the livelihood of millions. Cases in point: Elon Musk, Sergey Brin, Shahid Khan, Steve Chen, Jawed Karim, Vinod Khosla, etc. Protectionist laws will only turn away future great minds from creating the new products, innovations and technology breakthroughs that are key to pushing the country forward.
H1B visa holders number approximately 1 million workers (less than 0.3% of our population) who, research has pointed out, contribute positively to our GDP, pay into our tax system and Medicare / Medicaid, and raise a highly educated future generation (Gen 0 or 1). The economic impact of abolition will be extreme: there are few highly skilled American tech workers, and they will now come at a premium (all else remaining constant, that is, with no provision made to enhance tech skills for the masses). Small businesses and corporations surviving on thin margins will cease to be profitable and will have to go out of business, move their corporate HQs abroad, or outsource. If abolishing the H1B were a corporate decision, a statistical comparison of the upside and downside of this extreme action would make it unworthy of any consideration. I daresay there are bigger problems we face today.

While the H1B visa has many benefits for the US economy, the H1B system certainly does need reform to ensure higher levels of productivity and reduce misuse. Some reforms are detailed below:

Vet credentials and achievements of highly skilled workers, and the gaps that they fill, before granting H1Bs in the first place, so we know the best and brightest are coming to the US.
Immigration: Today the US allows many different routes for immigration, such as family, investment, diversity and highly skilled workers (H1B). H1B visas are “dual-intent” visas: the holders can apply for Permanent Residence. If highly skilled tech workers are the target, why restrict by country? Let all nationalities compete evenly on the basis of SKILLS. Why not make it a level playing field? If we want the best, the ‘cream of the crop’, let the toughest and brightest get in.
Minimum wage: I do believe in free markets. However, to reduce the number of applications the already overloaded USCIS must process, adding a minimum salary threshold of $100,000 or the like can ensure senior and skilled workers are prioritized for hiring.
Grant EADs to I-140s, making job mobility easier so the onus is on the companies to hire the best talent, pay them market rates and look towards increasing overall productivity. If mobility of the workforce is ensured, then companies will naturally hire more qualified Americans (benefit: less paperwork). This will also stop companies from holding H1B workers hostage by paying them less because they cannot move, and stop the exploitation by consulting and outsourcing companies. What happened to the Republican principles of free enterprise and less government anyway?
US Masters candidates: Why not? I agree there are plenty of excellent universities around the world. If we are going to hire skilled workers from everywhere, why can’t Masters degree holders from the US get an extra brownie point? At least US universities will benefit, and it will contribute to the GDP.
Create an ongoing training program for American citizens so they can keep their skills up to date and compete on a global platform. Perhaps Federal and State Governments can run a skills-upgrade program for unemployed workers, or perhaps some of our spending can be directed towards coal workers and blue-collar workers who have lost their jobs to automation, so they can go to community colleges and train to become technicians, laboratory workers, etc. You get the idea. How about some positive reinforcement to move the American economy forward?

Most actions can be reversed, such as immigration rules, industrial waste dumping policies or financial safety provisions. However, some repercussions may be permanent, such as the decline of the US from a global superpower into a has-been economy. That is good reason to get fully into the details and analyze all angles before a severe action like H1B abolition.

Vector and Matrix in Kaldi

Link:

http://blog.csdn.net/u013677156/article/details/79202271


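Vector and Matrix are among the most commonly used data types in Kaldi. Audio data, extracted features, and computed results are all stored in a Vector or a Matrix. As the names suggest, a Vector is a one-dimensional array of values, while a Matrix has two dimensions, rows and columns. Kaldi's Vector and Matrix support many mathematical operations, such as element-wise addition and scaling (adding a number to every element, or multiplying every element by a number), matrix multiplication, and singular value decomposition. They also support some special operations, such as taking the logarithm of every element or applying softmax over all elements.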

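I. First, an introduction to Vector.
matrix/kaldi-vector.h defines three classes: VectorBase, Vector, and SubVector. VectorBase is the base (parent) class; Vector and SubVector are derived (child) classes. The member functions of VectorBase already cover all the operations of a vector class; Vector merely wraps it, defining several forms of constructor and adding operations such as Resize.
VectorBase has just two data members: the pointer data_, which points to the memory holding the data, and the integer dim_, which gives the number of elements.
VectorBase has many member functions, but they fall roughly into two groups. The first consists of basic, simple operations, for example SetZero, which sets all the data to 0, and Max, which returns the largest value in the vector. The second consists of more application-oriented functions, for example ApplySoftMax, which applies softmax, and Norm, which computes a norm.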
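
As an illustration of the second group, here is a minimal standalone sketch of a softmax that subtracts the maximum before exponentiating and returns the log of the normalizer. It uses std::vector<double> rather than Kaldi types, and the name SoftMaxInPlace is hypothetical; it is meant to show the idea, not to reproduce Kaldi's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// In-place softmax over x; returns log(sum_i exp(x[i])).
// Subtracting the maximum first keeps exp() from overflowing
// when the inputs are large.
double SoftMaxInPlace(std::vector<double> *x) {
  assert(x != nullptr && !x->empty());
  const double max = *std::max_element(x->begin(), x->end());
  double sum = 0.0;
  for (double &v : *x) {
    v = std::exp(v - max);  // largest element maps to exp(0) = 1
    sum += v;
  }
  for (double &v : *x) v /= sum;  // normalize so the elements sum to 1
  return max + std::log(sum);
}
```

For an input such as {1000.0, 1000.0}, a naive exp(1000.0) would overflow to infinity, whereas this version yields {0.5, 0.5} and returns 1000 + log(2).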

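The code below helps in understanding the properties of VectorBase. Note that the source code has been modified for ease of reading and understanding.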

template<typename Real>  
class VectorBase {  
  // ******* Data members: data_ is the memory address, dim_ is the number of elements *******
  Real* data_;  
  MatrixIndexT dim_;  
  explicit VectorBase(): data_(NULL), dim_(0) { }  
    
  // ************** Group 1: basic, simple functions *******************
  // Set functions: set all values to 0, to a given value, or to random values from some distribution
  void SetZero();  
  void Set(Real f);  
  void SetRandn();  
  void SetRandUniform();  
  
  // Return the element count dim_, the data address data_, and the memory size in bytes; overload operator(), etc.
  inline MatrixIndexT Dim() const { return dim_; }  
  inline Real* Data() { return data_; }  
  inline MatrixIndexT SizeInBytes() const { return (dim_*sizeof(Real)); }  
  inline Real operator() (MatrixIndexT i) const { return *(data_ + i);  }  
  inline Real & operator() (MatrixIndexT i) {   return *(data_ + i);  }  
  Real Max() const;  
  Real Min() const;  
  Real Sum() const;  
  Real SumLog() const;  
    
  // Copy the contents of a vector or matrix (all or part, e.g. one row) into data_
  void CopyFromVec(const VectorBase<Real> &v);  
  void CopyFromPacked(const PackedMatrix<Real> &M);  
  void CopyRowsFromMat(const MatrixBase<Real> &M);  
  void CopyColsFromMat(const MatrixBase<Real> &M);  
  void CopyRowFromMat(const MatrixBase<Real> &M, MatrixIndexT row);  
  void CopyDiagFromMat(const MatrixBase<Real> &M);  
    
  // ************** Group 2: application-oriented operations and functions *******************
  void Add(Real c);      /// data_[i] += c;  
  void Scale(Real c);    /// data_[i] *= c; cblas_Xscal(dim_, c, data_, 1);  
  void ApplyLog();       /// data_[i] = Log(data_[i])  
  void ApplyExp();       /// data_[i] = Exp(data_[i])  
  void ApplyAbs();       /// data_[i] = abs(data_[i])  
  void InvertElements(); /// data_[i] = 1 / data_[i]  
  void ApplyPow(Real power);  // data_[i] = pow(data_[i], power)
  Real Norm(Real p) const;    // p-norm
  void MulElements(const VectorBase<Real> &v); //data_[i] *= v.data_[i];  
  void DivElements(const VectorBase<Real> &v); //data_[i] /= v.data_[i];  
  
  // Various linear-algebra operations, generally calling BLAS, e.g. AddVec: *this = *this + alpha * v
  void AddVec(const Real alpha, const VectorBase<Real> &v);
  void AddVec2(const Real alpha, const VectorBase<Real> &v); // *this = *this + alpha * v^2 (element-wise)
  void AddMatVec(...);  //  this <-- beta*this + alpha*M*v.
  void AddSpVec(...);   //  this <-- beta*this + alpha*M*v.
  void AddTpVec(...);   //  this <-- beta*this + alpha*M*v.
  void AddVecVec(...);  //  this <-- alpha * v .* r + beta*this .  
  void AddVecDivVec(...);// this <---- alpha*v/r + beta*this  
  void MulTp(...);      //  *this <-- *this *M  
    
  // Apply softmax: \f$ x(i) = exp(x(i)) / \sum_i exp(x(i)) \f$; returns log(sum_i exp(x(i)))
  Real ApplySoftMax(){  
    Real max = this->Max(), sum = 0.0;  
    for (MatrixIndexT i = 0; i < dim_; i++)  
      sum += (data_[i] = Exp(data_[i] - max));  
    this->Scale(1.0 / sum);  
    return max + Log(sum);  
  }  
  void Tanh(const VectorBase<Real> &src);  
  void Sigmoid(const VectorBase<Real> &src);  
}; // class VectorBase  
  
  
template<typename Real>  
class Vector: public VectorBase<Real> {  
 public:  
  // Constructors and the assignment operator.
  Vector(): VectorBase<Real>() {}  
  explicit Vector(const MatrixIndexT s, MatrixResizeType resize_type)  
      : VectorBase<Real>() {  Resize(s, resize_type);  }  
  Vector(const Vector<Real> &v) : VectorBase<Real>()  {   
    Resize(v.Dim(), kUndefined);  
    this->CopyFromVec(v);  }  
  explicit Vector(const VectorBase<Real> &v) : VectorBase<Real>() {  
    Resize(v.Dim(), kUndefined);  
    this->CopyFromVec(v);  }  
  Vector<Real> &operator = (const Vector<Real> &other) {  
    Resize(other.Dim(), kUndefined);  
    this->CopyFromVec(other);  
    return *this; }  
  
  // Added operations: Swap, Resize, and RemoveElement
  void Swap(Vector<Real> *other);  
  void Resize(MatrixIndexT length, MatrixResizeType resize_type = kSetZero);  
  void RemoveElement(MatrixIndexT i);  
  
 private:  
  void Init(const MatrixIndexT dim);  
  void Destroy();  
};  
  
template<typename Real>  
class SubVector : public VectorBase<Real> {  
 public:  
  // SubVector allocates no memory; it uses the data of another VectorBase and can be regarded as a "reference".
  // Below are the various constructors.
  SubVector(const VectorBase<Real> &t, const MatrixIndexT origin,  
            const MatrixIndexT length) : VectorBase<Real>() {  
     VectorBase<Real>::data_ = const_cast<Real*> (t.Data()+origin);  
    VectorBase<Real>::dim_   = length;  
  }  
  SubVector(const PackedMatrix<Real> &M) {  
    VectorBase<Real>::data_ = const_cast<Real*> (M.Data());  
    VectorBase<Real>::dim_   = (M.NumRows()*(M.NumRows()+1))/2;  
  }  
  SubVector(const SubVector &other) : VectorBase<Real> () {// Copy constructor  
    VectorBase<Real>::data_ = other.data_;  
    VectorBase<Real>::dim_ = other.dim_;  
  }  
  SubVector(Real *data, MatrixIndexT length) : VectorBase<Real> () {  
    VectorBase<Real>::data_ = data;  
    VectorBase<Real>::dim_   = length;  
  }  
  SubVector(const MatrixBase<Real> &matrix, MatrixIndexT row) {  
    VectorBase<Real>::data_ = const_cast<Real*>(matrix.RowData(row));  
    VectorBase<Real>::dim_   = matrix.NumCols();  
  }  
  ~SubVector() {}  ///< Destructor (does nothing; no pointers are owned here).  
  
 private:  
  /// Disallow assignment operator.  
  SubVector & operator = (const SubVector &other) {}  
};  

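From the code above we can see that Vector does not extend VectorBase much; their functionality is essentially the same. SubVector can be regarded as a kind of "reference": it does not allocate memory to hold data itself, but points to data in another object.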
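
The "reference" behaviour of SubVector can be sketched with a small non-owning view class. This is a simplified illustration, not Kaldi code: the class name VectorView is hypothetical, it works on plain double, and it offers only a couple of operations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A minimal non-owning view over a contiguous range of doubles,
// loosely analogous to SubVector: it stores a pointer and a length
// but never allocates or frees the underlying memory.
class VectorView {
 public:
  VectorView(double *data, std::size_t dim) : data_(data), dim_(dim) {}
  std::size_t Dim() const { return dim_; }
  double &operator()(std::size_t i) {
    assert(i < dim_);
    return data_[i];
  }
  // Multiplies every visible element by c, writing into the owner's memory.
  void Scale(double c) {
    for (std::size_t i = 0; i < dim_; i++) data_[i] *= c;
  }

 private:
  double *data_;     // borrowed, not owned
  std::size_t dim_;  // number of elements visible through the view
};
```

Given std::vector<double> owner = {1, 2, 3, 4, 5}, constructing VectorView mid(owner.data() + 1, 3) and calling mid.Scale(10.0) changes owner to {1, 20, 30, 40, 5}: the view modifies the owner's storage directly. As with any borrowed pointer, the view must not outlive the object whose memory it refers to.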

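II. A brief introduction to Matrix.
As with Vector, matrix/kaldi-matrix.h defines three classes: MatrixBase, Matrix, and SubMatrix. MatrixBase is the base class and the other two are derived classes. MatrixBase already implements a great many methods; Matrix only adds a few functions on top of the base class, such as Swap and RemoveRow, just as Vector does relative to VectorBase.
MatrixBase has few data members, and most are easy to understand: the integers num_rows_ and num_cols_ give the numbers of rows and columns, and the pointer data_ points to the memory holding the data. One more integer, stride_, deserves attention. stride_ holds the true length of a row in memory: each row has room for stride_ elements but need not be full, and only the first num_cols_ of them are used, so the remaining space is wasted. In most cases, of course, num_cols_ and stride_ are equal.
There are more operations on matrices than on vectors, so Matrix has many more member functions than Vector.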
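
The role of stride_ is easiest to see in the index calculation. The sketch below is a hypothetical struct (StridedMatrix is not a Kaldi class) that stores rows with padding and addresses element (r, c) as r * stride + c, the same arithmetic that operator() and SizeInBytes use in the listing that follows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major storage where each row occupies `stride` slots but only the
// first `num_cols` are valid. Member names mirror those discussed above;
// the struct itself is hypothetical.
struct StridedMatrix {
  std::size_t num_rows, num_cols, stride;
  std::vector<double> data;

  StridedMatrix(std::size_t rows, std::size_t cols, std::size_t stride_in)
      : num_rows(rows), num_cols(cols), stride(stride_in),
        data(rows * stride_in, 0.0) {
    assert(stride >= num_cols);
  }
  // Element (r, c) lives at offset r * stride + c, not r * num_cols + c.
  double &operator()(std::size_t r, std::size_t c) {
    assert(r < num_rows && c < num_cols);
    return data[r * stride + c];
  }
  // Memory use counts the padding as well as the valid columns.
  std::size_t SizeInBytes() const {
    return num_rows * stride * sizeof(double);
  }
};
```

For a 2 x 3 matrix with stride 4, element (1, 0) sits at offset 4 rather than 3, and the matrix occupies 2 * 4 * sizeof(double) bytes even though only six values are meaningful.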

template<typename Real>  
class MatrixBase {  
  //*************** Data members ********************
  Real*   data_;             // data memory area
  MatrixIndexT  num_cols_;   // < Number of columns
  MatrixIndexT  num_rows_;   // < Number of rows
  MatrixIndexT  stride_;     // True number of columns in memory (row stride)
    
  // Basic accessors
  inline MatrixIndexT NumRows() const { return num_rows_; }  
  inline MatrixIndexT NumCols() const { return num_cols_; }  
  inline MatrixIndexT Stride() const {  return stride_; }  
  inline Real* Data() const { return data_;  }  
  inline Real* RowData(MatrixIndexT i) { return data_ + i * stride_;  }  
  inline Real& operator() (MatrixIndexT r, MatrixIndexT c) { return *(data_ + r * stride_ + c); }
  size_t SizeInBytes() const {return num_rows_ * stride_ * sizeof(Real);}  
  Real &Index (MatrixIndexT r, MatrixIndexT c) {  return (*this)(r, c); }  
  
  // Set, Max, Min and similar functions (several omitted)
  void SetZero();  
  void Set(Real);  
  Real Sum() const;  
  Real Max() const;  
  Real Min() const;  
  bool IsZero(Real cutoff = 1.0e-05) const;  
  
  // Copy, SubVector, and SubMatrix functions (many overloads)
  void CopyFromMat(const CompressedMatrix &M);  
  void CopyRowsFromVec(const VectorBase<Real> &v);  
  void CopyDiagFromVec(const VectorBase<Real> &v);  
  inline SubVector<Real> Row(MatrixIndexT i);  
  inline SubMatrix<Real> Range(...);  
  
  // Element-wise arithmetic and other application-oriented operations
  void MulElements(const MatrixBase<Real> &A);  
  void DivElements(const MatrixBase<Real> &A);  
  void Scale(Real alpha);  
  void Max(const MatrixBase<Real> &A);  
  void Min(const MatrixBase<Real> &A);  
  void MulColsVec(const VectorBase<Real> &scale);  
  void MulRowsVec(const VectorBase<Real> &scale);  
  void Add(const Real alpha);  
  void ApplyFloor(Real floor_val);  
  void ApplyCeiling(Real ceiling_val);  
  void ApplyLog();  
  void ApplyExp();  
  Real ApplySoftMax();  
  void Sigmoid(const MatrixBase<Real> &src);  
  
  // Log-determinant, inversion, transpose, eigendecomposition, singular value decomposition, matrix arithmetic
  Real LogDet(Real *det_sign = NULL) const;  
  void Invert(Real *log_det = NULL, Real *det_sign = NULL,  
              bool inverse_needed = true);  
  void Transpose();  
  void Eig(MatrixBase<Real> *P,  
           VectorBase<Real> *eigs_real,  
           VectorBase<Real> *eigs_imag) const;  
  void Svd(VectorBase<Real> *s, MatrixBase<Real> *U,  
           MatrixBase<Real> *Vt) const;  
  void AddVecVec(...);    // *this += alpha * a * b^T
  void AddMat(...);       // *this += alpha * M
  void AddMatMatMat(...); // this <-- beta*this + alpha*A*B*C.
  void AddTpTp(...);      // this <-- beta*this + alpha*A*B.
};  


asrman
