Jay Kim

Senior Data Scientist at SoCalGas

Highly efficient, results-oriented data scientist with strong quantitative skills, development experience, and a strong educational background, including an MSc from Imperial College London (ranked within the QS world top 10). Responsible self-starter with demonstrated experience in statistical programming languages (R, Python, SAS, Scala) and in Python for API development. Strong visualization skills with tools such as Tableau, and a good understanding of relational databases (SQL, Oracle) and non-relational databases (HBase, MongoDB, Redis).

Experienced with machine learning tools such as Hadoop, Spark, H2O, Sparkling Water, PySparkling, and SAS, and with deep learning tools such as Keras, TensorFlow, Theano, MXNet, and PyTorch. Also experienced in GPU programming with CUDA and in scaling data science.

Expert in predictive modeling across supervised, unsupervised, semi-supervised, and reinforcement learning, including XGBoost, regression, logit, probit, GBM, random forest, neural networks (generative models, GAN, VAE, RNN, CNN, word2vec), Naive Bayes, k-nearest neighbors, and PCA. Also experienced in probabilistic modeling (PyMC3, Edward, Pyro), including MCMC, HMC, NUTS, Bayesian linear regression, and variational models.

Data mining skills include parsing and NLP (natural language processing), with proficiency in language modeling techniques such as topic models, text clustering, word embeddings (Word2Vec, GloVe), text classification, RNNs, and convolutional RNNs. Familiar with development environments such as Hadoop, cloud platforms (AWS, GCP, Azure), GPU, Spark, and Docker.