I need an API that takes a metadata definition and a set of input photos and produces a single output photo that looks as natural as possible. See the attached images for the rough idea.
Each class of recipe (e.g., Pizza) will have its own structure, such as:
Dough Base - white, wheat, green
Sauce - marinara, pesto, white
Cheese - mozzarella
Toppings[up to 3] - onion, mushroom, etc.
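To make the idea of a per-class structure more concrete, here is a rough sketch of how the pizza definition above might be written down and checked. Everything in it is an assumption for illustration only: the names LayerSlot, PIZZA and validate, the slot keys, and the rule that toppings may be empty are not part of any existing system. The toppings list contains only the two toppings named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerSlot:
    """One slot in a recipe class (hypothetical shape, not a fixed spec)."""
    name: str
    options: tuple[str, ...]
    min_count: int = 1
    max_count: int = 1


# Slots are listed bottom-most first, matching the structure above.
PIZZA = (
    LayerSlot("dough_base", ("white", "wheat", "green")),
    LayerSlot("sauce", ("marinara", "pesto", "white")),
    LayerSlot("cheese", ("mozzarella",)),
    # Only the toppings named in the brief; min_count=0 is an assumption.
    LayerSlot("toppings", ("onion", "mushroom"), min_count=0, max_count=3),
)


def validate(schema, selection):
    """Return a list of problems; an empty list means the selection fits."""
    problems = []
    known = {slot.name for slot in schema}
    for name in selection:
        if name not in known:
            problems.append(f"unknown slot: {name}")
    for slot in schema:
        chosen = selection.get(slot.name, [])
        if not slot.min_count <= len(chosen) <= slot.max_count:
            problems.append(
                f"{slot.name}: expected {slot.min_count}-{slot.max_count} "
                f"items, got {len(chosen)}"
            )
        for item in chosen:
            if item not in slot.options:
                problems.append(f"{slot.name}: unknown option {item!r}")
    return problems
```

A request for a wheat base with pesto, mozzarella, onion and mushroom would pass this check, while one with four toppings would come back with a single problem. Other food classes would each need their own slot list.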
The API would then be given the layer ordering, the images (many of which have quite a bit of transparency), and any other metadata that can reasonably be provided, and it would create the output image.
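As a baseline for what "ordering plus transparent images" means in practice, here is a minimal sketch of stacking layers with the standard "over" alpha blend. It assumes each image is already the same size and is represented as a grid of (r, g, b, a) tuples with values from 0.0 to 1.0; the function names are made up for illustration, and a real service would more likely use an imaging library than plain lists.

```python
def over(src, dst):
    """Blend one RGBA pixel over another (straight, non-premultiplied alpha)."""
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    out_a = sa + da * (1.0 - sa)
    if out_a == 0.0:
        # Both pixels fully transparent: nothing to show.
        return (0.0, 0.0, 0.0, 0.0)

    def blend(s, d):
        return (s * sa + d * da * (1.0 - sa)) / out_a

    return (blend(sr, dr), blend(sg, dg), blend(sb, db), out_a)


def composite(layers):
    """Stack equally sized RGBA grids; layers[0] is the bottom-most layer."""
    height, width = len(layers[0]), len(layers[0][0])
    result = [[(0.0, 0.0, 0.0, 0.0)] * width for _ in range(height)]
    for layer in layers:
        result = [
            [over(layer[y][x], result[y][x]) for x in range(width)]
            for y in range(height)
        ]
    return result
```

For a pizza, the list passed to composite would be the dough image first, then sauce, cheese, and each topping in the given order. This only reproduces plain layering; it does nothing to make the layers look like they belong together.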
There will be thousands of different types of food (not just pizza!).
I have done experiments with straight compositing. The result doesn't look horrible at low resolution, but the lack of shadows makes everything look artificial. Since this is food, it should look good and be as photorealistic as possible. In other words, the solution has to be smarter than simple compositing.
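To illustrate the kind of step that goes beyond plain layering, here is a rough sketch of one common trick: building a soft drop shadow from a layer's transparency mask and using it to darken whatever lies underneath. This is only an assumed starting point, not a claim about how the final solution should work. The function names, the box blur, and the fixed offset and opacity are all illustrative choices, and a shadow like this is a cheap visual cue rather than real lighting.

```python
def offset_shadow(alpha, dx, dy, opacity):
    """Shift a layer's alpha mask by (dx, dy) and scale it into a shadow mask."""
    height, width = len(alpha), len(alpha[0])
    shadow = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shadow[y][x] = alpha[src_y][src_x] * opacity
    return shadow


def box_blur(grid, radius):
    """Soften a mask by averaging each cell with its neighbours.

    Cells outside the grid count as 0, so the shadow fades at the edges.
    """
    height, width = len(grid), len(grid[0])
    window = (2 * radius + 1) ** 2
    blurred = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += grid[ny][nx]
            blurred[y][x] = total / window
    return blurred


def darken(base, shadow):
    """Darken an RGB grid in proportion to the shadow mask at each pixel."""
    return [
        [
            tuple(channel * (1.0 - shadow[y][x]) for channel in base[y][x])
            for x in range(len(base[0]))
        ]
        for y in range(len(base))
    ]
```

In use, the shadow for a topping would be computed from that topping's alpha channel, blurred, applied to the layers below with darken, and only then would the topping itself be drawn on top. The offset, blur radius and opacity would probably need to come from the per-image metadata, since a mushroom slice and a layer of sauce should not cast the same shadow.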
I don't quite know how to write the requirements for this project, so I'm open to clarifying questions.