Sounds like you're planning a "Markov" method similar to what I used for the world map generator, Madsiur. Assuming you already have layer 1 generated, layer 2 should be a lot easier to work from because it isn't as complex. You can make the program automate its own prep work by having it go through every map that uses the same tileset and catalogue which tiles can be adjacent to each other and still be valid. Ideally you would keep a running count of every instance of two adjacent tiles so you can generate layer 2 tiles that fit the observed probability distribution. I suppose it is possible to have a situation where a configuration of L1 tiles is legal but no possible L2 configuration matches it, but I expect this would be pretty rare, so the failure strategy could be something like "just don't use L2 tiles".
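To make the cataloguing step concrete, here is a rough Python sketch (not code from the world map generator). It assumes each map is a rectangular list of rows of tile ids; the names `catalogue_adjacency` and `pick_tile` are made up for illustration, and multiplying the west and north counts to get a weight is just one simple heuristic, not the only way to fit the distribution.

```python
from collections import Counter, defaultdict
import random

def catalogue_adjacency(maps):
    """Count every west-east and north-south tile pair over a list of 2D grids."""
    we = defaultdict(Counter)  # we[west][east] = number of times seen
    ns = defaultdict(Counter)  # ns[north][south] = number of times seen
    for grid in maps:
        for y, row in enumerate(grid):
            for x, tile in enumerate(row):
                if x + 1 < len(row):
                    we[tile][row[x + 1]] += 1
                if y + 1 < len(grid):
                    ns[tile][grid[y + 1][x]] += 1
    return we, ns

def pick_tile(west, north, we, ns, all_tiles, rng=random):
    """Pick a tile weighted by how often it followed both neighbours.

    Pass None for a neighbour that is off the map edge.
    Returns None when no tile is legal next to both neighbours.
    """
    weights = {}
    for t in all_tiles:
        w = 1
        if west is not None:
            w *= we[west][t]   # 0 if this pair was never seen
        if north is not None:
            w *= ns[north][t]
        if w > 0:
            weights[t] = w
    if not weights:
        return None
    tiles = list(weights)
    return rng.choices(tiles, weights=[weights[t] for t in tiles])[0]
```

A pair that never shows up in the source maps gets a count of zero, so it is treated as illegal. That is only as reliable as the set of maps you scan.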
For the world map, I had two basic validators, a "north-south" and a "west-east" validator, that were used to make sure each pair was legal. For a chosen tile called "center", these were applied once each ("north-center" and "west-center"), since the tiles were generated starting from the northwest corner. However, because the map had high complexity, I needed to optimize the program to minimize backtracking and finish in a reasonable amount of time. So I also used the validators for a 1-step lookahead ("center-south" and "center-east"), and a 2-step lookahead based on those ("east-southeast" and "south-southeast"). Actually I had more complex lookaheads than that, but you get the general idea.
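Roughly, the idea looks like the Python sketch below. This is a simplified illustration, not the actual world map code: it assumes the validators are plain sets of legal pairs (`we_ok`, `ns_ok`, both hypothetical names), it only does the 1-step lookahead, and it uses naive recursive backtracking, so it is only suitable for small grids.

```python
import random

def generate(width, height, tiles, we_ok, ns_ok, rng=None):
    """Fill a grid row by row from the northwest corner, backtracking on dead ends.

    we_ok / ns_ok are sets of legal (west, east) / (north, south) pairs.
    Returns the grid, or None if no legal grid exists.
    """
    rng = rng or random.Random()
    grid = [[None] * width for _ in range(height)]

    def legal(x, y, t):
        # Basic validators: west-center and north-center.
        if x > 0 and (grid[y][x - 1], t) not in we_ok:
            return False
        if y > 0 and (grid[y - 1][x], t) not in ns_ok:
            return False
        # 1-step lookahead, center-east: some tile must fit to the east,
        # and it must also sit legally under the tile already placed above it.
        if x + 1 < width and not any(
            (t, e) in we_ok and (y == 0 or (grid[y - 1][x + 1], e) in ns_ok)
            for e in tiles
        ):
            return False
        # 1-step lookahead, center-south: some tile must fit to the south.
        if y + 1 < height and not any((t, s) in ns_ok for s in tiles):
            return False
        return True

    def fill(i):
        if i == width * height:
            return True
        y, x = divmod(i, width)
        options = [t for t in tiles if legal(x, y, t)]
        rng.shuffle(options)
        for t in options:
            grid[y][x] = t
            if fill(i + 1):
                return True
        grid[y][x] = None  # undo and let the caller try its next option
        return False

    return grid if fill(0) else None
```

The lookahead checks don't change which grids are possible; they just reject doomed tiles one cell earlier, which is where the time savings come from.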
For choosing L2 tiles, I expect that you would still need the same N-S and W-E validators, but also maybe an L2North-L1South validator or something, because L2 tiles are often used to create the tops of trees, etc. Since L2 isn't so complex, I don't think it needs optimization, lookaheads, or backtracking, but there probably will be a few impossible situations, which is where the "just don't use L2 tiles" strategy comes in.
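As a sketch of what that single greedy pass could look like in Python: everything here is an assumption for illustration, including the `EMPTY` tile id, the `cross_ok` set standing in for the L2North-L1South validator, and the choice to apply the "don't use L2 tiles" fallback per cell rather than for the whole map.

```python
import random

EMPTY = 0  # hypothetical id meaning "no layer 2 tile here"

def generate_layer2(layer1, l2_tiles, we_ok, ns_ok, cross_ok, rng=None):
    """Greedy single pass over layer 2: no lookahead, no backtracking.

    we_ok / ns_ok: legal (west, east) / (north, south) pairs of L2 tiles.
    cross_ok: legal (L2 tile, L1 tile directly south of it) pairs,
              e.g. a treetop on L2 above a trunk on L1.
    """
    rng = rng or random.Random()
    height, width = len(layer1), len(layer1[0])
    layer2 = [[EMPTY] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            options = [
                t for t in l2_tiles
                if (x == 0 or (layer2[y][x - 1], t) in we_ok)
                and (y == 0 or (layer2[y - 1][x], t) in ns_ok)
                and (y + 1 == height or (t, layer1[y + 1][x]) in cross_ok)
            ]
            # Impossible situation: fall back to not using an L2 tile here.
            layer2[y][x] = rng.choice(options) if options else EMPTY
    return layer2
```

Since layer 1 is already complete, the cross-layer check can look one row south without any extra bookkeeping. If you catalogue pairs from existing maps, `EMPTY` would just be another tile in the tables.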