A whole lot of what I think are really helpful comments have come into this thread since my last post.
But on the specific question you asked and I responded to, I would confirm my earlier view. There's nothing wrong with the basic capture (but do study what the others have said). The vibrancy or 'pop' that you refer to is brought out through post-processing.
You are absolutely on the right road in shooting in RAW. The frustrating thing about that in the beginning is the feeling that you don't have all the post-processing knowledge and skills to get the best out of the RAW data. But that is part of the learning curve and will come. Stick with it.
You write about your experiences of sharpening. On a connected theme, and when you have time, go into the tutorial, here, on Local Contrast Enhancement (LCE). It is still one of the best methods of getting that 'pop' into your images.
I hope you don't mind, but just to illustrate what I mean, here is your image and then a copy of it I made with LCE added. Is this the sort of thing you were thinking about?
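
PS: in case it helps to see what LCE is doing under the hood. As I understand it, it is essentially an unsharp mask run with a large radius and a small amount: blur the image heavily, then push each pixel a little further away from its blurred neighbourhood. Below is a rough Python sketch of that idea on a greyscale image held as a list of rows. The function names, the sigma = radius / 2 rule and the default radius and amount are my own assumptions for illustration, not the tutorial's settings or any editor's actual implementation.

```python
import math

def gaussian_kernel(radius):
    # 1-D Gaussian weights normalised to sum to 1.
    # Assumption: sigma = radius / 2 (a convenient choice, not a standard).
    sigma = max(radius / 2.0, 1e-6)
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(img, kernel):
    # Blur each row with the 1-D kernel, clamping at the image edges.
    radius = len(kernel) // 2
    width = len(img[0])
    out = []
    for row in img:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(img):
    return [list(col) for col in zip(*img)]

def gaussian_blur(img, radius):
    # Separable blur: rows first, then columns via a transpose.
    kernel = gaussian_kernel(radius)
    horizontal = blur_rows(img, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))

def local_contrast_enhance(img, radius=50, amount=0.2):
    # Unsharp-mask style LCE: pixel + amount * (pixel - blurred),
    # clipped to the 0-255 range. Defaults are illustrative only.
    blurred = gaussian_blur(img, radius)
    return [
        [min(255.0, max(0.0, p + amount * (p - b)))
         for p, b in zip(row, brow)]
        for row, brow in zip(img, blurred)
    ]
```

In practice you would not write this yourself; the unsharp mask tool in your editor with a wide radius and a low amount should give the same sort of effect. Flat areas are left alone, while the dark side of a tonal transition gets slightly darker and the light side slightly lighter, which is where the 'pop' comes from.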