ADS Capstone Chronicles Revised
4.2.1 Building Database. The cleaned retail data is organized into four tables. The customer_info table holds details about each customer, including age, gender, income, city, and contact information. It describes the customer base and supports targeted, personalized marketing. The transactions_details table records every purchase: transaction ID, customer ID, date, time, amount spent, and number of items bought. It is the basis for analyzing buying patterns and assessing customer value, both of which feed sales analysis and marketing planning. The product_info table holds product IDs, categories, brands, and types. It helps identify popular products, improve inventory management, and recognize trends in customer preferences. The feedback table captures customer reviews and ratings linked to product and customer IDs, which gives insight into customer opinions and pinpoints areas for improvement. An incremental ID was generated for each row to serve as the primary key used in table joins. This structure supports more advanced SQL queries across the four tables, making the analysis more efficient and comprehensive.

4.3 Feature Engineering

Feature engineering is a crucial step in data preparation: it transforms raw data into meaningful features that improve the performance of machine learning models. In this e-commerce project, several feature engineering techniques are used to enrich the dataset. Key features were derived from the transactional data, including customer lifetime value (CLV), average transaction frequency, and recency of the last purchase, providing deeper insight into customer behavior.
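As an illustration of how transaction-derived features of this kind might be computed with SQL, the following is a minimal sketch using Python's built-in sqlite3 module. The column names, sample rows, and the helper customer_features are assumptions made for the example and are not the project's actual schema or code. CLV is left out because its definition is not given here; total spend is shown instead.

```python
import sqlite3
from datetime import date

# Hypothetical, simplified version of the transactions_details table.
# Column names are assumptions, not the project's actual definitions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE transactions_details (
        transaction_id INTEGER PRIMARY KEY,  -- incremental key used for joins
        customer_id    INTEGER NOT NULL,
        purchase_date  TEXT    NOT NULL,     -- ISO 8601 date string
        amount         REAL    NOT NULL,
        total_items    INTEGER NOT NULL
    )
    """
)

# Made-up sample rows, for illustration only.
rows = [
    (1, 101, "2024-01-05", 50.0, 2),
    (2, 101, "2024-03-10", 30.0, 1),
    (3, 102, "2024-02-20", 200.0, 5),
]
conn.executemany("INSERT INTO transactions_details VALUES (?, ?, ?, ?, ?)", rows)


def customer_features(conn, as_of):
    """Return one dict of transaction-derived features per customer."""
    query = """
        SELECT customer_id,
               CAST(julianday(?) - julianday(MAX(purchase_date)) AS INTEGER)
                   AS recency_days,            -- days since the last purchase
               COUNT(*)         AS frequency,  -- number of transactions
               SUM(amount)      AS monetary,   -- total amount spent
               AVG(amount)      AS avg_amount, -- average spend per transaction
               SUM(total_items) AS total_items
        FROM transactions_details
        GROUP BY customer_id
        ORDER BY customer_id
    """
    cursor = conn.execute(query, (as_of.isoformat(),))
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor]


features = customer_features(conn, date(2024, 3, 31))
for record in features:
    print(record)
```

The reference date passed as as_of fixes the point from which recency is measured, so the same query gives reproducible results when rerun on a later day.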
Additional features, such as the total number of items bought, the average amount spent per transaction, and product category affinities, were created to capture purchasing patterns. Customer segmentation based on Recency, Frequency, and Monetary (RFM) analysis was conducted to classify customers into segments such as Loyal Customers, At-Risk Customers, and Big Spenders. Time-based features, such as the time of day and day of the week of each transaction, were generated to identify peak shopping periods. Combined with rigorous cleansing and normalization to ensure data integrity, these engineered features significantly improved the predictive accuracy and robustness of the machine learning models. Careful feature engineering is pivotal in moving from raw data to actionable insights, enabling more effective decision-making and resource allocation in the competitive e-commerce landscape.

4.4 Modeling

Two modeling techniques are used. The first segments customers according to the recency, frequency, and monetary value of their purchases, which helps companies tailor marketing campaigns to each segment. The second recommends products to the customers in each segment.

In the highly competitive e-commerce landscape, organizations recognize that customers spend a limited amount of time on their websites, making an intuitive and efficient user interface crucial. For the company under review, the product catalog spans five major categories (books, clothing, electronics, grocery, and home decor) with a total of 318 unique products. These products are further divided into 33 distinct types, which highlights the need for effective product recommendations to maximize customer engagement and satisfaction. By displaying the top five suggested products to each consumer, the company can enhance the shopping experience, drive sales, and build customer loyalty.
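To make the two-stage idea concrete (segment customers on RFM, then suggest a short list of products per segment), the following is a minimal sketch in plain Python. It is not the modeling approach used in this project: the rule-based thresholds, the function names rfm_segment and top_products_by_segment, and the sample data are all hypothetical, and the recommender is a simple popularity baseline chosen only for illustration.

```python
from collections import Counter


def rfm_segment(recency_days, frequency, monetary,
                max_recency=30, min_frequency=5, min_monetary=500.0):
    """Assign an illustrative segment label. All thresholds are hypothetical."""
    if monetary >= min_monetary:
        return "Big Spenders"
    if frequency >= min_frequency and recency_days <= max_recency:
        return "Loyal Customers"
    if recency_days > max_recency:
        return "At-Risk Customers"
    return "Other"


def top_products_by_segment(purchases, segments, k=5):
    """Return the k most-purchased product IDs for each segment.

    purchases: iterable of (customer_id, product_id) pairs
    segments:  dict mapping customer_id -> segment label
    """
    counts = {}
    for customer_id, product_id in purchases:
        segment = segments.get(customer_id)
        if segment is None:
            continue  # skip purchases by customers without a segment
        counts.setdefault(segment, Counter())[product_id] += 1
    return {
        segment: [product for product, _ in counter.most_common(k)]
        for segment, counter in counts.items()
    }


# Made-up customers: customer_id -> (recency_days, frequency, monetary)
customers = {
    101: (10, 8, 300.0),
    102: (90, 1, 40.0),
    103: (5, 2, 900.0),
}
segments = {cid: rfm_segment(*rfm) for cid, rfm in customers.items()}

# Made-up purchase history as (customer_id, product_id) pairs
purchases = [
    (101, "P1"), (101, "P2"), (101, "P1"),
    (102, "P3"),
    (103, "P2"), (103, "P2"), (103, "P4"),
]
recommendations = top_products_by_segment(purchases, segments, k=5)
print(segments)
print(recommendations)
```

A popularity baseline like this is mainly useful as a reference point: any learned recommender should at least outperform showing each segment its five best-selling products.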