Hi,

I'm trying to parse two dates from a text string. My starting point is a horrific multi-tab Excel file whose tab names look like this:

Balance sheet 2015_3_4
Balance sheet 2015_11_12
Balance sheet 2016_9

Why does this regex work?

(\D+)(\d{4})(_)(\d+)(_)?(\d+)?

Specifically, why don't I need "?" after each "\d+", i.e. why isn't the regex this:

(\D+)(\d{4})(_)(\d+?)(_)?(\d+?)?

I think what I'm struggling to understand here is when "laziness" actually needs to be specified.
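
One way to see the difference is to run both patterns against the tab names above. This is only a minimal sketch, and it assumes Python's re module (the post doesn't say which regex engine is in use); GREEDY, LAZY, TABS and month_groups are made-up names for illustration.

```python
import re

# The two patterns from the post, unchanged.
GREEDY = re.compile(r"(\D+)(\d{4})(_)(\d+)(_)?(\d+)?")
LAZY = re.compile(r"(\D+)(\d{4})(_)(\d+?)(_)?(\d+?)?")

TABS = [
    "Balance sheet 2015_3_4",
    "Balance sheet 2015_11_12",
    "Balance sheet 2016_9",
]


def month_groups(pattern, text):
    """Return the captures of groups 4 and 6 (the numbers after the year)."""
    m = pattern.match(text)  # anchored at the start only, not at the end
    return (m.group(4), m.group(6)) if m else None


for tab in TABS:
    print(tab, month_groups(GREEDY, tab), month_groups(LAZY, tab))

# Greedy \d+ takes every digit up to the next "_" or the end of the name:
#   "Balance sheet 2015_11_12" -> ("11", "12")
# Lazy \d+? takes as few digits as it can, and nothing forces it to take more:
#   "Balance sheet 2015_11_12" -> ("1", "1")
```

In this sketch the greedy "\d+" stops at "_" on its own because "_" is not a digit, so there is nothing for laziness to fix. A "?" after a quantifier ("\d+?") means "match as little as possible", which is a different thing from the "?" after a group ("(_)?"), which means "optional". The lazy version only gives the same captures as the greedy one when something forces it to consume the whole name, for example a trailing "$" or a full-string match.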

The gist of the source sheet is that there are 2 "amount" columns on each tab. My next problem will be to construct a pair of full date columns, match them up with the pair of "amount" columns, and then reshape the result into a single date column and a single amount column. If you can follow that and have any tips, those would be handy too!
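
A rough sketch of that reshaping step, again in Python and with several loud assumptions: the numbers after the year are taken to be month numbers, each date is set to the first day of its month, and the Nth number is taken to belong to the Nth "amount" column. None of that is confirmed by the sheet itself, and TAB_RE, tab_dates and to_long are hypothetical names.

```python
import re
from datetime import date

# Year, first number, optional second number; anchored at the end of the name.
TAB_RE = re.compile(r"\D+(\d{4})_(\d+)(?:_(\d+))?$")


def tab_dates(tab_name):
    """Turn a tab name into one date per number found after the year."""
    m = TAB_RE.match(tab_name)
    if m is None:
        raise ValueError(f"unexpected tab name: {tab_name!r}")
    year = int(m.group(1))
    # ASSUMPTION: the numbers are months; the day is fixed to the 1st.
    months = [int(g) for g in m.groups()[1:] if g is not None]
    return [date(year, month, 1) for month in months]


def to_long(tab_name, rows):
    """Flatten rows of (amount_1, amount_2) into (date, amount) pairs.

    ASSUMPTION: the Nth amount column belongs to the Nth date in the tab name.
    """
    dates = tab_dates(tab_name)
    return [(d, amount) for row in rows for d, amount in zip(dates, row)]
```

For example, to_long("Balance sheet 2015_3_4", [(100.0, 200.0)]) would pair 100.0 with 1 March 2015 and 200.0 with 1 April 2015. Because zip stops at the shorter input, a tab with only one number in its name (such as "Balance sheet 2016_9") would keep only the first amount of each row, which may or may not be what the sheet intends.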


Thanks,


Andy