Visual Properties of SARS-CoV-2 Sequences

This is a selection of Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequences converted into 2D images with some distinct visual properties, particularly those on the more highly organized side. The algorithm introduced here for converting linear RNA/DNA strands into 2D images is based on two kinds of structures. The first is a linear structure of 5 discrete values from the gray scale, with a 25% difference in the property “black” between neighbors: Black (B) = 100%, Dark (D) = 75%, Gray (Gr) = 50%, Light (L) = 25% and White (W) = 0%. Here the values D, Gr and L have two neighbors each, while B and W have only one. To these five values we assign the RNA/DNA bases in this way: U = Black, C = Dark, A = Gray, G = Light and T = White.

Fig.1

Interestingly, all the base pairs in this representation are defined by a single rule: a 50% value difference (Fig.1). The second structure, which is the foundation for establishing a 2D image, is a structure of positions in the form of a 3×4 matrix. Here the number 3 corresponds to the number of bases in a codon, while 4 is the number of different bases in RNA or DNA. In this structure we can place 12 bases/values at a time.
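As a minimal sketch of the value scale and the 50% pairing rule described so far, the assignment can be written as a small lookup table. The names `BLACKNESS`, `scale_neighbors`, `pair_differences` and `blocks_of_12` are illustrative and do not come from the original algorithm.

```python
# Blackness in percent for each base, as assigned in the text.
BLACKNESS = {"U": 100, "C": 75, "A": 50, "G": 25, "T": 0}

# Standard complementary base pairs (DNA and RNA).
BASE_PAIRS = [("A", "T"), ("A", "U"), ("G", "C")]


def scale_neighbors(base):
    """Bases whose value lies exactly one 25% step away on the gray scale."""
    return sorted(b for b, v in BLACKNESS.items()
                  if abs(v - BLACKNESS[base]) == 25)


def pair_differences():
    """Value difference, in percent, for every complementary base pair."""
    return {a + "-" + b: abs(BLACKNESS[a] - BLACKNESS[b])
            for a, b in BASE_PAIRS}


def blocks_of_12(sequence):
    """Split a strand into consecutive 12-base blocks (3 x 4 positions).

    The last block may be shorter if the length is not a multiple of 12.
    """
    return [sequence[i:i + 12] for i in range(0, len(sequence), 12)]


print(scale_neighbors("A"))   # ['C', 'G']
print(scale_neighbors("U"))   # ['C']
print(pair_differences())     # {'A-T': 50, 'A-U': 50, 'G-C': 50}
```

The end values (U and T) have one neighbor on the scale, and every complementary pair comes out at a 50% difference.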

It is worth noting that the elements in this 2D structure have different neighborhood relationships. Each of the four corner-elements has 2 neighbors, two elements in the middle have four, while the other six elements have three neighbors each. (Fig.2)

Fig.2
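The neighborhood counts for the 3×4 structure can be checked with a short sketch. It assumes that "neighbor" means a position sharing an edge (not a diagonal), which is the reading that reproduces the counts given in the text.

```python
from collections import Counter

ROWS, COLS = 3, 4


def neighbors(r, c):
    """Positions that share an edge with (r, c) inside the 3 x 4 grid."""
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    return [(r + dr, c + dc) for dr, dc in steps
            if 0 <= r + dr < ROWS and 0 <= c + dc < COLS]


# How many positions have 2, 3 or 4 neighbors.
degree_counts = Counter(len(neighbors(r, c))
                        for r in range(ROWS) for c in range(COLS))

# Every adjacency is seen from both of its ends, hence the division by 2.
adjacent_pairs = sum(len(neighbors(r, c))
                     for r in range(ROWS) for c in range(COLS)) // 2

print(dict(sorted(degree_counts.items())))  # {2: 4, 3: 6, 4: 2}
print(adjacent_pairs)                       # 17
```

The total of 17 adjacent pairs is the number of spatial neighborhoods available in any single 12-position image.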

There are different ways to place a linear structure into this 3×4 matrix, which have been presented in another paper. Here we adopt the one shown in Fig.3. When a set of values is placed within a structure of positions, then in addition to their neighborhoods based on values they also enter into spatial neighborhoods (neighborhoods of positions).

Fig.3

In this case, the spatial neighborhoods between elements containing the same value we will name Connections (Cn), while the neighborhoods between different values will be Junctions (Jn). While there is no value difference between neighboring elements in the case of Connections, there are three possible value differences in the case of Junctions: Jn1 = 25% (T-G, G-A, A-C or C-U), Jn2 = 50% (T-A, G-C or A-U), and Jn3 = 75% (T-C or G-U).
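Counting connections and junctions for one 12-base block could be sketched as follows. The placement used here (one codon per row, four rows read top to bottom) is only an assumption for illustration and is not necessarily the order adopted in Fig.3, so individual counts may differ from those read off the figures. The 12-base input is assumed to be the opening bases of the source record and should be checked against it. The function names are hypothetical.

```python
from collections import Counter

BLACKNESS = {"U": 100, "C": 75, "A": 50, "G": 25, "T": 0}


def to_grid(block, rows=4, cols=3):
    """Place 12 bases into a grid.

    ASSUMPTION: one codon per row, rows filled top to bottom. The
    placement in the original figure may be different.
    """
    if len(block) != rows * cols:
        raise ValueError("expected exactly %d bases" % (rows * cols))
    return [block[r * cols:(r + 1) * cols] for r in range(rows)]


def count_neighborhoods(grid):
    """Count Cn (equal values) and Jn1/Jn2/Jn3 (25/50/75% difference)."""
    counts = Counter()
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            # Look only right and down so each adjacent pair is counted once.
            for rr, cc in ((r, c + 1), (r + 1, c)):
                if rr < rows and cc < cols:
                    diff = abs(BLACKNESS[grid[r][c]] - BLACKNESS[grid[rr][cc]])
                    # T and U do not occur in the same strand, so a 100%
                    # difference is not expected here.
                    counts["Cn" if diff == 0 else "Jn%d" % (diff // 25)] += 1
    return counts


counts = count_neighborhoods(to_grid("ATTAAAGGTTTA"))
print(dict(counts))
print(sum(counts.values()))  # 17
```

Whatever placement is used, connections and junctions together always add up to the 17 adjacent pairs of the grid.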

Fig.4

For example, the beginning of this SARS-CoV-2 genome strand organized in 2D images is shown in Fig.4. The very first image (4a) contains five T, two G and five A. There are two connections between T, one between G and three between A, along with six Jn2 junctions between T-A, two Jn1 between G-A and two Jn1 between T-G. Another image in the same line (4b) contains 6T, 2G and 2C, with one G-G and three A-A connections. The number of Jn1 junctions is 3 for T-G and 2 for A-C, while there are 5 T-A junctions (Jn2) and 3 T-C junctions (Jn3).

Fig.5

Interesting cases are those with the highest number of connections, like those shown in Fig.5, which are states of low entropy. On the other hand, the cases with no connections at all could be considered states of the highest entropy (Fig.6).

Fig.6

In the low entropy case 5a, there are 11 connections (6 C-C and 5 A-A) and 6 A-C junctions Jn1, while in cases 5b and 5c there are 10 connections (5 T-T and 5 A-A) and 7 T-A junctions Jn2.

On the other hand, in the presented cases of high entropy there are no connections (Cn = 0). In case 6a there are 17 junctions: Jn1 = 4 (A-C), Jn2 = 11 (T-A) and Jn3 = 2 (T-C), while in 6c there are also 17 junctions: Jn1 = 11 (4 T-G, 2 G-A, 5 A-C), Jn2 = 1 (G-C) and Jn3 = 5 (T-C).

Clearly, in the low entropy states the number of junctions is lower, while in high entropy states this number is higher (the reverse holds for connections). It might be interesting to notice that entropy in thermodynamics is defined between two values (hot and cold), while here, looking into the DNA/RNA strands, entropy has to be defined within states with 2, 3 and 4 different values. It seems that on the micro-level the number of junctions is the key indicator of the degree of organization of a certain state, but this also brings up some interesting questions.

Fig.7

For example, the three-value states (Fig.7a and 7b) closely resemble the low entropy case 5c. That state, however, consists of only two values in equal numbers (6T and 6A), and its number of connections (Cn = 10) is higher than its number of junctions (Jn2 = 7). In the three-value state 7a, the number of connections (Cn = 9) is still higher than the 8 junctions: Jn1 = 1 (A-C), Jn2 = 3 (T-A) and Jn3 = 4 (T-C). In the other, similar three-value state 7b, the number of connections (Cn = 8) is lower than the number of junctions, which is 9 (Jn1 = 2, Jn2 = 4, Jn3 = 3).
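The counts reported for Figs. 5 to 7 can be put side by side in a small consistency check: in every case connections and junctions should add up to the 17 adjacent pairs of the 3×4 structure. The "junction share" computed below is only an illustrative summary number introduced for this sketch, not a measure defined in the text.

```python
ADJACENT_PAIRS = 17  # 3 rows x 3 horizontal + 2 x 4 vertical adjacencies

# (Cn, Jn1, Jn2, Jn3) as reported in the text for each case.
REPORTED = {
    "5a":    (11, 6, 0, 0),
    "5b/5c": (10, 0, 7, 0),
    "6a":    (0, 4, 11, 2),
    "6c":    (0, 11, 1, 5),
    "7a":    (9, 1, 3, 4),
    "7b":    (8, 2, 4, 3),
}


def junction_share(cn, jn1, jn2, jn3):
    """Fraction of adjacent pairs that are junctions (hypothetical measure)."""
    junctions = jn1 + jn2 + jn3
    return junctions / (cn + junctions)


for name, case in REPORTED.items():
    # Each reported case should account for all 17 adjacent pairs.
    assert sum(case) == ADJACENT_PAIRS, name
    print(name, "junction share = %.2f" % junction_share(*case))
```

Under this summary the low entropy cases of Fig.5 fall near 0.35 to 0.41, the high entropy cases of Fig.6 sit at 1.00, and the three-value states 7a and 7b lie in between.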

There are also interesting cases, such as two-value states (Fig.8), that are not easy to classify, like these two with 10 A and 2 C. The number of junctions (Jn1) is 4 in 8a and 6 in 8b, which renders both as cases of low entropy.

Fig.8 and Fig.9

The next states, defined by the ending stretch of this SARS-CoV-2 genome (p.29866), are even more difficult to define (Fig.9). State 9a consists of 11 A and only one C, with 14 connections and 3 junctions (Jn1), and in the next state (9b) the C is positioned such that there are only two junctions. This seems to be the lowest number of junctions possible, but here we have an 11:1 ratio between the two values. For comparison, with an equal number of values (6:6) there are states that have only 3 junctions (not found in this genome).

Finally, all further states defined by this ending stretch consist only of the value A (9c). Here only connections exist (Cn = 17); there are no junctions (Jn = 0). According to the numbers these should be the states with the lowest entropy, but the question, of course, is whether this notion can be defined within a state containing only one value. In a way, although they have no junctions at all, states like these are closer to being defined as high entropy states, since within them there are no sections with different properties. In other words, any position within such a state is surrounded by the same values, which resembles a uniformly warm room in a state of maximum entropy. It seems that within these discrete states with a small number of positions and four values, the notions of high organization and entropy will most likely have to be redefined.

These are only some of the themes that could be explored within this kind of visual representation of DNA and RNA. For example, it might be interesting to identify the unary operators that would generate this particular genome, or to compare the patterns and distribution of light and dark within the representation of some of its longer sections.
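The junction minima quoted for two-value states (two junctions at an 11:1 ratio, three at 6:6, none for a single value) can be checked by brute force over all placements in the 3×4 structure. This is a sketch under the assumption that neighbors share an edge; `min_junctions` is an illustrative name.

```python
from itertools import combinations

ROWS, COLS = 3, 4
CELLS = [(r, c) for r in range(ROWS) for c in range(COLS)]

# Every edge-sharing pair of positions, listed once (right and down).
PAIRS = [((r, c), (r + dr, c + dc))
         for r, c in CELLS
         for dr, dc in ((0, 1), (1, 0))
         if r + dr < ROWS and c + dc < COLS]


def min_junctions(minority_count):
    """Fewest junctions over all ways to place `minority_count` positions
    of a second value among the 12 positions of the grid."""
    best = len(PAIRS)
    for chosen in combinations(CELLS, minority_count):
        chosen = set(chosen)
        # A junction is an adjacent pair with exactly one end in `chosen`.
        junctions = sum((a in chosen) != (b in chosen) for a, b in PAIRS)
        best = min(best, junctions)
    return best


print(min_junctions(1))  # 2  (11:1, the single value sits in a corner)
print(min_junctions(6))  # 3  (6:6, two solid halves)
print(min_junctions(0))  # 0  (one value only, all 17 pairs are connections)
```

The search space is small (at most 924 placements for the 6:6 case), so exhaustive enumeration is enough here.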

All these images might help us to identify and interpret certain genetic properties of this particular virus, but at the same time they themselves are, in a way, glimpses of the “world” the way it is perceived (interpreted) by this observer.

 

Gregor Mobius, March-April 2020

Source:

https://www.ncbi.nlm.nih.gov/nuccore/MN908947.3?report=graph

By replacing the base T with U (or white with black) we get the RNA representation of this virus.
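In code, this substitution is a one-line replacement. The sketch below also shows how the gray values change; the function name `dna_to_rna` and the 12-base input (assumed to be the opening bases of the source record) are illustrative.

```python
BLACKNESS = {"U": 100, "C": 75, "A": 50, "G": 25, "T": 0}


def dna_to_rna(sequence):
    """Replace every T with U; the other bases stay as they are."""
    return sequence.upper().replace("T", "U")


dna = "ATTAAAGGTTTA"
rna = dna_to_rna(dna)

print(rna)                          # AUUAAAGGUUUA
print([BLACKNESS[b] for b in dna])  # [50, 0, 0, 50, 50, 50, 25, 25, 0, 0, 0, 50]
print([BLACKNESS[b] for b in rna])  # [50, 100, 100, 50, 50, 50, 25, 25, 100, 100, 100, 50]
```

Only the white positions change (to black); the gray, light and dark positions are identical in both versions.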

* * * * * * * *

Appendix

Four of these sequences are here organized in larger 9×16 structures according to the algorithm presented below.

