This is a proposal to replace the existing alphabetic representation of the DNA/RNA bases U, C, A, G and T with five discrete values on a gray scale: Black, Dark, Gray, Light and White. Each value is a percentage of a single reference value named “black”, defined as the “absence of light”. Thus Black (Uracil) is 100% “black”, Dark (Cytosine) is 75%, Gray (Adenine) is 50%, Light (Guanine) is 25% and White (Thymine) is 0%. Neighboring values on the scale differ by 25%.
One interesting property of this representation: if we list all pairs of bases whose values differ by 50%, we get only three pairs: U-A, C-G and A-T. With this single rule we have not only established all the base pairs for DNA and RNA, but can also see why only these pairs, and not others, are possible. The base pairs can be interpreted as additions: G-C (G+A=C), T-A (T+A=A), A-U (A+A=U), or as subtractions: C-G (C-A=G), A-T (A-A=T) and U-A (U-A=A). The “exchange” value in these transactions is always the same: Gray (50% black), which represents Adenine.
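The 50% rule can be checked mechanically. The following is a minimal sketch in Python, assuming only the percentages given above; the names `GRAY` and `pairs_differing_by` are hypothetical and chosen for illustration.

```python
from itertools import combinations

# Percent "black" for each base, as defined in the proposal.
GRAY = {"U": 100, "C": 75, "A": 50, "G": 25, "T": 0}


def pairs_differing_by(step=50):
    """Return all unordered base pairs whose gray values differ by `step` percent."""
    return [
        (a, b)
        for a, b in combinations(GRAY, 2)
        if abs(GRAY[a] - GRAY[b]) == step
    ]


# Enumerate every pair of bases that is exactly 50% apart on the scale.
for a, b in pairs_differing_by():
    print(f"{a}-{b}: {GRAY[a]}% and {GRAY[b]}%")
```

Only the three pairs U-A, C-G and A-T pass the test; every other combination of the five bases differs by 25%, 75% or 100%.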
Furthermore, this visual representation could enable the analysis of larger structures: the distribution patterns of black, gray, light, etc., and various clusters that might be specific to certain gene sequences. Numerical values could even be attached to them. For example, the last attached image shows the entire Genetic Code (left: RNA, right: DNA) in both visual and alphabetic representations. On the left, all 64 RNA codons are shown, from UUU at the upper left to GGG at the lower right, and one can literally see the differences.
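As a rough illustration of how numerical values might be attached to codons, here is a small Python sketch. It assumes the codons are ordered U, C, A, G at each position (which gives UUU first and GGG last, as described), and it uses the mean gray value of a codon as one possible numerical summary. Both the ordering and the mean are assumptions made for this example, and the function names are hypothetical; the layout of the attached image is not reproduced here.

```python
from itertools import product

# Percent "black" for each base, as defined in the proposal.
GRAY = {"U": 100, "C": 75, "A": 50, "G": 25, "T": 0}

# Assumed base order for the RNA codon table (UUU first, GGG last).
RNA_ORDER = "UCAG"


def codon_table(order=RNA_ORDER):
    """List all 64 codons in the assumed order."""
    return ["".join(bases) for bases in product(order, repeat=3)]


def to_gray(seq):
    """Translate a base sequence into its list of gray percentages."""
    return [GRAY[base] for base in seq.upper()]


def mean_gray(seq):
    """One possible numerical summary: the average gray value of a sequence."""
    values = to_gray(seq)
    return sum(values) / len(values)


codons = codon_table()
for codon in (codons[0], codons[-1]):
    print(codon, to_gray(codon), mean_gray(codon))
```

The same `to_gray` translation applies to longer sequences, so a gene fragment could be rendered as a strip of gray values and searched for clusters.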
This visual representation might give much better results and understanding when analyzing larger sequences than the existing one, which is based on “arbitrary” letters without meaningful intrinsic relationships.