BigPetStore Product Generator
BigPetStore currently reads products from a JSON file. This approach was useful during the initial development and validation work. The number of customer purchasing patterns BigPetStore can manifest depends on the number of products. Manually creating products is time-intensive, however, and limits what we can do with BigPetStore. To overcome this limitation, I came up with a strategy for enumerating products from a specification, which I’ll describe here. I implemented the strategy and submitted it as a patch.
Product Category Basics
Products always belong to mutually exclusive categories. Products within the same category are interchangeable. Thus, the API for creating product descriptions centers around a ProductCategoryBuilder. Product categories include details such as the applicable species (so only owners of those species buy those particular products), the number of times a product is used per day, the average and variance of the amount of the product used per pet, and parameters for determining when a product triggers a transaction and when a product is purchased within a transaction.
ProductCategoryBuilder builder = new ProductCategoryBuilder();
builder.addApplicableSpecies(PetSpecies.DOG);
builder.setCategory("dry dog food");
builder.setTriggerTransaction(true);
builder.setDailyUsageRate(2.0);
builder.setAmountUsedPetPetAverage(0.25);
builder.setAmountUsedPetPetVariance(0.1);
builder.setTriggerTransactionRate(2.0);
builder.setTriggerPurchaseRate(7.0);
Specifying Product Fields
The products themselves are generated by enumerating the Cartesian product of the specified field values. With each field value, a sum term and a product term must be specified. The price for a given product is determined by first adding all the sum terms from its field values to the base price, then multiplying the result by all of the product terms. The quantity field is required: it is used to determine when a customer will use up a product, and it also modifies the price to account for the size of the product.
builder.setBasePrice(2.0);
builder.addPropertyValues("brand",
    new ProductFieldValue("Wellfed", 0.0, 1.0),
    new ProductFieldValue("Happy Pup", 0.67, 1.0),
    new ProductFieldValue("Dog Days", 1.0, 1.0));
builder.addPropertyValues("flavor",
    new ProductFieldValue("Chicken", 0.0, 1.0),
    new ProductFieldValue("Pork", 0.0, 1.0),
    new ProductFieldValue("Lamb & Rice", 0.0, 1.0),
    new ProductFieldValue("Fish & Potato", 0.0, 1.0));
builder.addPropertyValues("organic",
    new ProductFieldValue("true", 0.0, 1.25),
    new ProductFieldValue("false", 0.0, 1.0));
builder.addPropertyValues("quantity",
    new ProductFieldValue(4.5, 0.0, 4.5),
    new ProductFieldValue(15.0, 0.0, 15.0),
    new ProductFieldValue(30.0, 0.0, 30.0));
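To make the pricing rule concrete, here is a minimal, self-contained sketch of the enumeration and price calculation described above. It is not the code from the patch: FieldValueSketch and PricingSketch are hypothetical stand-ins, quantities are stored as strings for simplicity, and the base price of 2.0 is taken from the example. With the fields above, the Cartesian product has 3 × 4 × 2 × 3 = 72 combinations, and the organic Happy Pup chicken product with quantity 4.5 is priced at (2.0 + 0.67) × 1.25 × 4.5, or roughly 15.02.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for ProductFieldValue: a value plus its two price terms.
final class FieldValueSketch {
    final String value;
    final double sumTerm;
    final double productTerm;

    FieldValueSketch(String value, double sumTerm, double productTerm) {
        this.value = value;
        this.sumTerm = sumTerm;
        this.productTerm = productTerm;
    }
}

final class PricingSketch {
    // The field values from the specification above, keyed by field name.
    static Map<String, List<FieldValueSketch>> dogFoodFields() {
        Map<String, List<FieldValueSketch>> fields = new LinkedHashMap<>();
        fields.put("brand", Arrays.asList(
            new FieldValueSketch("Wellfed", 0.0, 1.0),
            new FieldValueSketch("Happy Pup", 0.67, 1.0),
            new FieldValueSketch("Dog Days", 1.0, 1.0)));
        fields.put("flavor", Arrays.asList(
            new FieldValueSketch("Chicken", 0.0, 1.0),
            new FieldValueSketch("Pork", 0.0, 1.0),
            new FieldValueSketch("Lamb & Rice", 0.0, 1.0),
            new FieldValueSketch("Fish & Potato", 0.0, 1.0)));
        fields.put("organic", Arrays.asList(
            new FieldValueSketch("true", 0.0, 1.25),
            new FieldValueSketch("false", 0.0, 1.0)));
        fields.put("quantity", Arrays.asList(
            new FieldValueSketch("4.5", 0.0, 4.5),
            new FieldValueSketch("15.0", 0.0, 15.0),
            new FieldValueSketch("30.0", 0.0, 30.0)));
        return fields;
    }

    // Enumerate the Cartesian product: one value chosen per field.
    static List<Map<String, FieldValueSketch>> enumerate(
            Map<String, List<FieldValueSketch>> fields) {
        List<Map<String, FieldValueSketch>> combos = new ArrayList<>();
        combos.add(new LinkedHashMap<>());
        for (Map.Entry<String, List<FieldValueSketch>> field : fields.entrySet()) {
            List<Map<String, FieldValueSketch>> next = new ArrayList<>();
            for (Map<String, FieldValueSketch> partial : combos) {
                for (FieldValueSketch v : field.getValue()) {
                    Map<String, FieldValueSketch> extended = new LinkedHashMap<>(partial);
                    extended.put(field.getKey(), v);
                    next.add(extended);
                }
            }
            combos = next;
        }
        return combos;
    }

    // price = (basePrice + all sum terms) * (all product terms multiplied together)
    static double price(double basePrice, Iterable<FieldValueSketch> values) {
        double sum = basePrice;
        double product = 1.0;
        for (FieldValueSketch v : values) {
            sum += v.sumTerm;
            product *= v.productTerm;
        }
        return sum * product;
    }

    public static void main(String[] args) {
        List<Map<String, FieldValueSketch>> products = enumerate(dogFoodFields());
        System.out.println(products.size() + " combinations before exclusions");
        System.out.printf("first product price: %.2f%n",
            price(2.0, products.get(0).values()));
    }
}
```

Because the quantity value doubles as a product term, larger sizes scale the price linearly, while the brand premium enters as a sum term before the scaling is applied.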
Filtering Products
Not all combinations of field values yield valid products, however. For example, a value brand of dog food may not use premium or organic ingredients. The product generator module provides a way of implementing exclusions through a Boolean algebra DSL. FieldPredicate is used for matching on field values, while AndRule, OrRule, and NotRule can be used to produce combinations of values to exclude.
builder.addExclusionRule(new AndRule(
    new FieldPredicate("brand", "Happy Pup"),
    new FieldPredicate("organic", "true")));
builder.addExclusionRule(new AndRule(
    new FieldPredicate("flavor", "Pork"),
    new FieldPredicate("brand", "Dog Days")));
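The following sketch illustrates how exclusion rules of this kind could be evaluated against each enumerated combination. It is an assumption about the general shape of such a DSL, not the implementation in the patch: RuleSketch, ExclusionSketch, and the helper methods field, and, or, and not are hypothetical names. Applying the two rules above to the 72 combinations removes 12 organic Happy Pup products and 6 Dog Days pork products, leaving 54.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical rule interface: returns true when a product should be excluded.
interface RuleSketch {
    boolean matches(Map<String, String> product);
}

final class ExclusionSketch {
    // Matches when the named field has exactly the given value.
    static RuleSketch field(String name, String value) {
        return product -> value.equals(product.get(name));
    }

    // Matches only when every sub-rule matches.
    static RuleSketch and(RuleSketch... rules) {
        return product -> {
            for (RuleSketch rule : rules) {
                if (!rule.matches(product)) {
                    return false;
                }
            }
            return true;
        };
    }

    // Matches when at least one sub-rule matches.
    static RuleSketch or(RuleSketch... rules) {
        return product -> {
            for (RuleSketch rule : rules) {
                if (rule.matches(product)) {
                    return true;
                }
            }
            return false;
        };
    }

    // Matches when the wrapped rule does not.
    static RuleSketch not(RuleSketch rule) {
        return product -> !rule.matches(product);
    }

    // Count the combinations from the dog food example that survive the exclusions.
    static int countSurvivors() {
        String[] brands = {"Wellfed", "Happy Pup", "Dog Days"};
        String[] flavors = {"Chicken", "Pork", "Lamb & Rice", "Fish & Potato"};
        String[] organics = {"true", "false"};
        String[] quantities = {"4.5", "15.0", "30.0"};

        // A product is excluded if any single exclusion rule matches it.
        RuleSketch excluded = or(
            and(field("brand", "Happy Pup"), field("organic", "true")),
            and(field("flavor", "Pork"), field("brand", "Dog Days")));

        int kept = 0;
        for (String brand : brands) {
            for (String flavor : flavors) {
                for (String organic : organics) {
                    for (String quantity : quantities) {
                        Map<String, String> product = new HashMap<>();
                        product.put("brand", brand);
                        product.put("flavor", flavor);
                        product.put("organic", organic);
                        product.put("quantity", quantity);
                        if (!excluded.matches(product)) {
                            kept++;
                        }
                    }
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(countSurvivors() + " products remain after exclusions");
    }
}
```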
Lastly, once the products have been specified, the build() method can be used to generate a ProductCategory object containing all the products.
ProductCategory dogFood = builder.build();
Future Work
Being able to generate products opens new opportunities for BigPetStore and helps us realize our goals of generating pattern-rich data. However, this also introduces a few challenges. First, parameterizing Markov Models is computationally expensive, which shows when we increase from 10 to 1200 products in a category. Anticipating that this would be a problem, I implemented the ability to generate purchasing profiles separately from customers so that purchasing profiles can be re-used. Nonetheless, alternative and more efficient approaches to model generation and evaluation are warranted.
We also want to enable users to choose the size of the product space. I’m considering two approaches for this. The first is to enable dynamic plugins by embedding a scripting language such as Clojure or Groovy. This would allow users to choose between pre-supplied product specifications or extend the product space with their own specifications. Plugins also have the advantage of keeping configuration information separate from compiled code. The second approach is sub-sampling, which would allow the user to control how many of the products are generated. It may be possible to combine the two approaches.
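As a rough illustration of the sub-sampling idea, one simple option is to shuffle the generated products with a seeded random number generator and keep the first few. This is only a sketch of one possible approach, not a design decision: SubSampleSketch, maxProducts, and seed are hypothetical names, and a real implementation might instead need to preserve the balance of products across field values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class SubSampleSketch {
    // Return up to maxProducts items chosen uniformly at random.
    // A fixed seed makes the selection reproducible across runs.
    static <T> List<T> sample(List<T> products, int maxProducts, long seed) {
        List<T> shuffled = new ArrayList<>(products);
        Collections.shuffle(shuffled, new Random(seed));
        int count = Math.max(0, Math.min(maxProducts, shuffled.size()));
        return new ArrayList<>(shuffled.subList(0, count));
    }
}
```

Copying the input before shuffling leaves the full product list untouched, so the same category could be sub-sampled at several sizes.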