H.Jun Park
DATA SCIENCE PORTFOLIO
Harvard University Projects
PROFESSIONAL GRADUATE DATA SCIENCE COURSEWORK
Official Certificate
Lending Club Project
The main objective of this project is to provide a model that guides a potential investor in selecting funding opportunities on the
P2P lending platform LendingClub. I built a stacking ensemble model and a deep learning model on millions of LendingClub loan records (2007-present), covering about $16B in US-based lending.
Project Website: https://junparkh.github.io/LendingClub/
GitHub notebook (Python): 1. Wrangling, 2. EDA, 3. Modeling
Stock Prediction Model
This is an individual project. I built a model to predict Korean public stock prices from NAVER STOCK data. I scraped the data from the website and wrangled it; after standardization and cleaning, I applied machine learning methods, including a deep learning model.
GitHub notebook (Python): 1. Wrangling, 2. Modeling
Statistical Research
How MBA Degrees Affect Salaries
This research examines the effect of an MBA degree on salary using statistical methods in R. I gathered data from the database of the Department of Economics at Lancaster University and applied methods such as the t-test and the chi-square test.
Document : Link
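As a rough illustration of the two tests named above, here is a minimal sketch. The original analysis was done in R; this version uses Python's standard library only and computes the test statistics and degrees of freedom, not p-values. The function names (`welch_t`, `chi_square`) and all sample numbers are hypothetical and are not taken from the Lancaster University data.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's two-sample t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


def chi_square(table):
    """Pearson chi-square statistic for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical salaries (in thousands) for graduates with and without an MBA.
mba = [72, 80, 76, 84, 88]
non_mba = [60, 66, 63, 69, 72]
t_stat, t_df = welch_t(mba, non_mba)

# Hypothetical counts: rows = MBA / no MBA, columns = above / below median salary.
counts = [[30, 10], [20, 40]]
chi2_stat, chi2_dof = chi_square(counts)

print(f"t = {t_stat:.3f}, df = {t_df:.2f}")
print(f"chi-square = {chi2_stat:.3f}, dof = {chi2_dof}")
```

The t-test compares mean salaries between two groups, while the chi-square test checks whether two categorical variables (here, degree status and salary band) are independent. In R, the corresponding built-in functions are `t.test` and `chisq.test`.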
Probabilistic Programming and Artificial Intelligence
Probabilistic programming has multiple uses in machine learning and artificial intelligence. I developed reinforcement learning methods to solve warehouse robotics cases.
GitHub notebook (Python):
AlmondMedia
CEO & Founder of AI Art Startup in South Korea
- Generative AI art platform with 70K MAU in Japan, Korea, and Southeast Asia
- Raised $1M from VCs / 500 Startups' 2021 KISED Batch
- Obtained 4 AI & tech patents in Korea
Service
AmongLive
Among.Live was the first global AI art community platform, with 70,000 active paid users on a monthly subscription model powered by Stripe.
With our fine-tuned AI art models, trained on over 50 million community text entries, users can generate customized artwork. An AI-powered translation system ensures global accessibility, while a patent-pending TensorFlow CL-CNN filtering system allows users to personalize content by level and interest.
Key features include user-customizable GPT chatbots and AI-generated art and video.
Technical Stack: Built with JavaScript and Python for AI services, leveraging our proprietary fine-tuned Stable Diffusion and DALL-E 3 models.
User Data KPI & Database System Architecture
Tech Patents
Patent 1
AI-Based CLASSIFYING CONTENT TYPE
- Analyze content types, such as text, images, and audio, through AI filtering to assess the content’s attributes (e.g., violence, sexuality, copyright issues).
- Classify text, images, and videos using a CL-CNN-based model.
- Use content log data for recommendation modeling.
- Use user data to classify abusive or fake users.
Patent 2
Emotion Analysis-Based Automated Translation
- Classify various elements within content and apply translation appropriate to each element.
- For multilingual analysis of images (such as speech bubbles) and text, separate each text component and conduct emotion analysis to execute contextually accurate translations, minimizing machine translation errors.
Patent 3
APPARATUS FOR MEASURING A CONTRIBUTION OF CONTENT AND METHOD THEREOF
Patent 4
SYSTEM FOR PROVIDING NON-FUNGIBLE TOKEN BASED GAME ASSET PLATFORM SERVICE
Side Project: Today's Plant
Founded and managed the 'Today's Plant' e-commerce brand with a local bonsai farmer over two years, doubling revenue to $400K.
Achieved 500% Return on Advertising Spend (ROAS), 15% monthly growth, and a 28% profit margin through AI-driven ads, SEO-optimized product selection, and customer journey analysis.
Oversaw all aspects of the business, including setting up customer service, marketing, and operations.
Data: Google Ads position & SEO