# Molecule to atoms

Here is a interested algorithm question on codewars site.

For a given chemical formula represented by a string, count the number of atoms of each element contained in the molecule and return an object.

For example:

water = ‘H2O’

parse_molecule(water) #return {H:2,O:1}

magnesium_hydroxide=’Mg(OH)2′

parse_molecule(magnesium_hydroxide) #return {Mg:1,O:2,H:2}

fremy_salt=’K4[ON(SO3)2]2′

parse_molecule(fremy_salt) #return {K:4,O:14,N:2,S:4}

As you can see, some formulas have brackets in them, The index outside the brackets tells you that you have to multiply count of each atom inside the bracket on this index. For example, in Fe(NO3)2 you have one iron atom, two nitrogen atoms and six oxygen atoms.

Note that brackets may be round,square or curly and can also be nested. index after the braces is optional.

In this question, the molecule can be very complex, such like this: K4[ON(SO3)2]2{Na7H9[Fe7Mg2(P3S)2]}4, just an example, maybe there is no such chemical in real world.

To solve this problem, one way is how to make the molecule presentation simple and simple step by step, for example:

We can use a recursion algorithm to remove bracket until no more brackets in the molecule and finally count the number of atoms.

First, to split the string into groups and make sure certain groups include bracket(without nested ) and index, this can be done by regular express:

#### (.*)(\[.*\]|\(.*\)|\{.*\})(\d+)(.*)

use this regular expression, the molecule in above picture can be split into four groups:

group1: K4[ON(SO3)2]2{Na7H9[Fe7Mg2

group2:  (P3S)

group3: 2

group4: ]}4

Then we merge group 2 and group 3 into one: (P6S2), and remove the bracket, it will be P6S2 as a new group, merge group1, new group and group4 into new string:

### K4[ON(SO3)2]2{Na7H9[Fe7Mg2P6S2]}4

repeat above processing until there is no more bracket, the final string will be:

### K4[ON(SO3)2]2{Na7H9[Fe7Mg2P6S2]}4

for merging group2 and group3, we can use a individual function:

Define a main function to do the group splitting and final atom counting:

The following two tabs change content below.

#### Wang Weiqiang (王维强)

Wang Weiqiang is a senior web developer, and professional on ASP.NET, MVC, C#, Python, SQL Server, HTML5, Javascript, also interesting in machine learning and related algorithm.

#### Latest posts by Wang Weiqiang (王维强) (see all)

Wang Weiqiang is a senior web developer, and professional on ASP.NET, MVC, C#, Python, SQL Server, HTML5, Javascript, also interesting in machine learning and related algorithm.

Posted in Algorithm Tagged with: ,