Sum of largest two eigenvalues of a simple graph.3

July 2, 2008

The following graphs give counter-examples to Gernert’s conjecture \lambda_1+\lambda_2 \leq n for graphs \Gamma with n vertices, for n = 34, 36, 37 though the examples do not have a connected complementary graph \Gamma^{\circ}.

Together with Nikiforov’s 2006 paper:

Nikiforov, V. (2006) Linear combinations of graph eigenvalues. Electronic Journal of Linear Algebra, 15, 329-336. (nikiforov-sums_of_graph_eigenvalues)

and the recent (June 13, 2008 ) paper of Javad Ebrahimi, Bojan Mohar, Vladimir Nikiforov and Azhvan Sheikh Ahmady (eigenvalue_sum_extended), it’s beginning to look more likely that for all n \geq 21 there is a graph \Gamma on n vertices with \lambda_1 + \lambda > n. As Bojan Mohar remarks, following the construction in his recent paper, for n \equiv 0 \textrm{ (mod 7)} and sufficiently large, there is such a \Gamma with \Gamma^{\circ} conected.

*************

The graph with 37 vertices and 657 edges, and adjacency matrix that has zeros in the following places:

(1, 1), (2, 2), (2, 14), (2, 17), (2, 19), (2, 26), (2, 34), (2, 37), (3, 3), (4, 4), (5, 5), (6, 6), (6, 8), (7, 7), (8, 6), (8, 8), (9, 9), (9, 34), (10, 10), (11, 11), (12, 12), (12, 14), (13, 13), (14, 2), (14, 12), (14, 14), (15, 15), (16, 16), (17, 2), (17, 17), (18, 18), (19, 2), (19, 19), (20, 20), (21, 21), (22, 22), (23, 23), (24, 24), (25, 25), (26, 2), (26, 26), (27, 27), (28, 28), (29, 29), (30, 30), (31, 31), (32, 32), (33, 33), (34, 2), (34, 9), (34, 34), (35, 35), (36, 36), (37, 2), (37, 37)

has sum of the two largest eigenvalues \lambda_1 +\lambda_2 \approx 37.0176 > 37.

It has characteristic polynomial -x^2(1+x)^28(2+x)(128+642x+344x^2-226x^3-191x^4-30x^5+x^6)

numerical spectrum (-3.08297, -2., -1.92919, -1.77378, -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -0.231694, -2.13825*10^-17, 7.84933*10^-17, 1.47326, 35.5444),

degree sequence (36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 35, 35, 35, 35, 35, 35, 35, 35, 34, 34, 30)

and energy \approx 74.0353.

The complement is disconnected:

*************

The graph with 36 vertices and 622 edges, and adjacency matrix that has zeros in the following places:

(1,1), (2,2), (3,3), (3,16), (4,4), (5,5), (6,6), (7,7), (8,8), (9,9), (9,16), (9,33), (10,10), (11,11), (12,12), (13,13), (14,14), (15,15), (16,3), (16,9), (16,16), (16,19), (16,24), (16,35), (17,17), (18,18), (19,16), (19,19), (20,20), (21,21), (22,22), (23,23), (23,36), (24,16), (24,24), (24,33), (25,25), (26,26), (27,27), (28,28), (29,29), (30,30), (31,31), (32,32), (33,9), (33,24), (33,33), (34,34), (35,16), (35,35), (36,23), (36,36)

has sum of the two largest eigenvalues \lambda_1 +\lambda_2 \approx 36.0019 > 36.

It has characteristic polynomial x(1+x)^29(2+x)(4+290x+31x^2-125x^3-31x^4+x^5)

numerical spectrum (-3.06372,-2.,-1.92434,-1.,-1.,-1.,-1.,-1.,-1.,-1.,-1.,-1.,-1.,-1.,-1.,-1.,-1.,-1.,1.42018,34.5817),

degree sequence (35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 34, 34, 34, 34, 34, 33, 33, 33, 30)

and energy \approx 57.98994.

The complement is disconnected:

*************

The graph with 34 vertices and 553 edges, and adjacency matrix that has zeros in the following places:

(1,1), (2,2), (3,3), (3,6), (4,4), (5,5), (6,3), (6,6), (6,11), (6,27), (7,7), (8,8), (9,9), (10,10), (10,27), (11,6), (11,11), (11,25), (12,12), (13,13), (14,14), (15,15), (16,16), (17,17), (18,18), (19,19), (20,20), (21,21), (22,22), (23,23), (24,24), (25,11), (25,25), (25,27), (26,26), (27,6), (27,10), (27,25), (27,27), (27,29), (27,30), (28,28), (29,27), (29,29), (30,27), (30,30), (31,31), (32,32), (33,33), (34,34)

has sum of the two largest eigenvalues \lambda_1 +\lambda_2 \approx 34.063309767691338026 > 34.

It has characteristic polynomial (1+x)^27(-21+133x+622x^2+328x^3-209x^4-175x^5-27x^6+x^7)

numerical spectrum (-3.03079, -2.13762, -1.60305, -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -0.396219, 0.10436, 1.5018, 32.5615),

degree sequence (33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 32, 32, 32, 32, 31, 31, 30, 28 )

and energy \approx 68.3353.

The complement is disconnected:

Adjacency matrices of graphs, binary trees and the Farey tree

Viewing graph energy as a function from rational numbers to algebraic integers

The adjacency matrix A(\Gamma) of a graph \Gamma is a symmetric matrix with 0’s on the main diagonal and 0’s or 1’s elsewhere. A(\Gamma), and therefore \Gamma is uniquely determined by the upper triangular entries a_{i,j} with j > i. For a graph with n vertices there are \frac{1}{2}n(n-1) such entries and we can arrange them as a single vector of 0’s and 1’s of length \frac{1}{2}n(n-1): (a_{1,2}, a_{1,3}, a_{1,4}, \ldots,a_{1,n}, a_{2,3}, a_{2,4},\ldots, a_{2,n},a_{3,4}, \ldots, a_{3,n}, \ldots, a_{n-1,n}).

This vector of 0’s and 1’s uniquely determines the symmetric adjacency matrix and so uniquely determines the graph \Gamma.

We can view a vector of 0’s and 1’s as determining a unique position in the infinite binary tree. We navigate the infinite binary tree by making a left move from any node if we are at a 0 in the sequence of 0’s and 1’s (starting from the left) and by making a right move if we are at a 1 in the sequence of 0’s and 1’s.

For example, the graph:

with adjaceny matrix \left(\begin{array}{c c c c } 0 & 1 & 0& 1\\ \ 1 & 0 & 1& 0 \\ \ 0 & 1 & 0 & 1 \\ \ 1 & 0 & 1& 0\end{array}\right) , gives the sequence (1,0,1,1,0,1) which corresponds to the bottom node shown below on the infinite binary tree:

We can conveniently, think of this node as representing the dyadic number 0.10111= \frac{23}{32}. Because the sequences of 0’s and 1’s we obtain from the upper triangular portion of the adjacency matrix of a graph are finite, we obtain in this way, dyadic fractions – that is fractions between 0 and 1 whose numerators are a power of 2. We can view the tree of dyadic rationals as follows:

We do not obtain all dyadic fractions in this way, without some reflection. For example, the dyadic fraction \frac{11}{16} is the dyadic number 0.1011 which corresponds in the binary tree to the binary sequence (1,0,1,1). Because this sequence is of length 4, which is not a triangular number – that is, of the form \frac{1}{2}n(n-1) – it does not come from the adjacency matrix of a simple graph. However, we can rectify this situation by appending as many 0’s to the sequence (1,0,1,1) as needed until we do obtain a sequence whose length is a triangular number. For example, the smallest triangular number greater than or equal to 4 is 6, so we add two 0’s to the sequence (1,0,1,1) to get (1,0,1,1,0,0). This sequence corresponds uniquely to the matrix \left(\begin{array}{c c c c } 0 & 1 & 0& 1\\ \ 1 & 0 & 1& 0 \\ \ 0 & 1 & 0 & 0 \\ \ 1 & 0 & 0 & 0\end{array}\right) , and so to the graph:

If we add as many 0’s as we require to get to the smallest triangular number greater than or equal to the length of the dyadic number corresponding to a dyadic rational , then we will have a well-defined procedure for mapping any dyadic rational, in the interval (0,1), back into a unique sequence of 0’s and 1’s whose length is a triangular number. In this way we obtain a one-one correspondence between simple graphs and dyadic fractions greater than 0 and less than 1.

Now we can match rationals between 0 and 1 with dyadic rationals by using the Farey tree and Minkowski’s question mark function. The Farey construction of the rational numbers consists of the fractions between 0 and 1 generated recursively as follows: the initial sequence of rationals is (\frac{0}{1},\frac{1}{1}); At each step a new sequence is constructed by forming the mediant of neighboring fractions in the previous sequence. The mediant of two fractions \frac{p}{q}, \frac{r}{s} is \frac{p+r}{q+s}. So the next sequence is (\frac{0}{1}, \frac{1}{2},\frac{1}{1}) and the following sequence is (\frac{0}{1}, \frac{1}{3},\frac{1}{2}, \frac{2}{3},\frac{1}{1}). Following this construction, the entire set of rational numbers between 0 and 1 can be arranged in a tree with two initial nodes 0 and 1, with each fraction placed in the tree immediately below its lowest mediant progenitor:

Utilizing the identical stucture of the Farey tree and the binary tree, Minkowski’s question mark function, denoted ?, maps the fractions between 0 and 1, inclusive, onto the dyadic rationals. An inductive definition of ? goes as follows: ?(0) = 0, ?(1) = 1 and ?(\frac{p+r}{q+s}) =?(\frac{p}{q})+?(\frac{r}{s}), where \frac{p+r}{q+s} is the mediant of \frac{p}{q} and \frac{r}{s}.

Minkowski’s question mark function is a a continuous and monotonically increasing function from the set of rational numbers between 0 and 1 to the set of dyadic rationals between 0 and 1. It’s graph looks as follows:



So now we can uniquley match any rational number strictly between 0 and 1 with a dyadic rational number strictly between 0 and 1; we can match the dyadic rational with a unique sequence of 0’s and 1’s whose length is a triangular number; we can match each such sequence with a symmetric matrix of 0’s and 1’s with 0’s on the main diagonal; we can match such matrices with simple graphs; and vice versa. So by this procedure we have matched each simple graph with one and only one rational number between 0 and 1 and each rational number with one and only one graph.

As a result, any function defined for all graphs can be regarded as a function defined on the rational numbers. In particular, we can view the energy of a graph as a function defined on the rational numbers in the interval (0,1) and taking values in the ring of algebraic integers.

Isomorphic graphs, which give row-permuted adjacency matrices, will generally give different dyadic rationals and so different rational numbers, under this correspondence.

For computational purposes the recursive definition of the Minkowski question mark function can be replaced by the alternative definition that comes from the finite continued fraction for a rational number in the interval \lbrack 0,1 \rbrack: if the rational number x has the finite continued fraction \lbrack a_1,a_2,\ldots,a_n\rbrack then ?(x) = 2\sum_{k=1}^n(-1)^{k-1} 2^{-(a_1+\ldots+a_n)} (Reference: minkowski_question_mark_function – although notice that Linas Vepstas gets the power of -1 wrong: it should be (-1)^{k-1} and not (-1)^k).

For example, \frac{7}{9}=\lbrack1,3,2\rbrack so?(\frac{7}{9})= 2(\frac{1}{2}^1- \frac{1}{2}^{1+3} +\frac{1}{2}^{1+3+2}) =\frac{29}{32} =0.11101_2. The base 2 string of digits of ?(\frac{7}{9}) has length 5, which is not a triangular number, so we add an extra 0 to get the binary string 111010. This corresponds to the matrix \left(\begin{array}{c c c c } 0 & 1 & 1 & 1\\ \ 1 & 0 & 0 & 1 \\ \ 1 & 0 & 0 & 0 \\ \ 1 & 1 & 0 & 0\end{array}\right) and so to the graph:

with energy \approx 4.96239.

Here is a plot of the energy for the dyadic rationals \{\frac{a}{2^{10}}\vert a = 1,\ldots,2^{10}-1\}:

Sum of largest two eigenvalues of a simple graph.2

Below are several other examples of non-isomorphic graphs with 40 vertices, 770 edges, and sum of the two largest eigenvalues > 40. None of these has a connected complement.

1. The graph \Gamma_2(40,770) with v=40 vertices and e = 770 edges, and adjacency matrix:

has \lambda_1+\lambda_2 \approx 40.0086 > v=40.

The characteristic polynomial is x(1+x)^31 (-68-393x+1131x^2 +1681x^3 +101x^4-622x^5 -274x^6-31x^7+x^8) and the sorted numerical spectrum is (-3.10756, -2.32462, -1.93923, -1.85196, -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -0.132379, -4.09544*10^-16, 0.347119, 1.48047, 38.5282).

The graph has (sorted) degree sequence (39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 38, 38, 38, 38, 38, 38, 38, 38, 37, 37, 37, 33).

The energy is E(\Gamma_2(40,770)) \approx 80.7115.

The graph \Gamma_2(40,770) is, therefore, not isomorphic with \Gamma_1(40,772)

2. The graph\Gamma_3(40,770) with v=40 vertices and e=770 edges with the following adjacency matrix:

has \lambda_1+\lambda_2 \approx 40.0284 > 40.

The characteristic polynomial is x^2(1+x)^29(2+x)(-50+271x +1577x^2+1506x^3 -74x^4 -649x^5-273x^6-31x^7+x^8) and the sorted numerical spectrum is (-3.13442, -2.13787, -2., -1.86594, -1.60276, -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -0.396136, -8.94552*10^-17, 1.54841*10^-16, 0.108693, 1.50412, 38.5243).

The graph has (sorted) degree sequence (39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 38, 38, 38, 38, 38, 38, 38, 38, 37, 37, 37, 33).

The energy is E(\Gamma_3(40,770)) \approx 80.2743.

The graph \Gamma_3(40,770) is, therefore, not isomorphic with \Gamma_1(40,772) or with \Gamma_2(40,772).

3. The graph\Gamma_4(40,770) with v=40 vertices and e=770 edges with the following adjacency matrix:

has \lambda_1+\lambda_2 \approx 40.0473 > 40.

The characteristic polynomial is $latex(1+x)^31(33-137x-454x^2+1202x^3 +1783x^4+139x^5 -618x^6-274x^7-31x^8+x^9)$ and the sorted numerical spectrum is (-3.04901, -2.30734, -2.20329, -1.65743, -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -0.346656, 0.199239, 0.317189, 1.52195, 38.5253).

The graph has (sorted) degree sequence (39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 38, 38, 38, 38, 38, 38, 38, 38, 37, 37, 37, 33).

The energy is E(\Gamma_3(40,770)) \approx 81.1274.

The graph \Gamma_4(40,770) is, therefore, not isomorphic with \Gamma_1(40,772), \Gamma_2(40,770) or \Gamma_3(40,772).

4. The graph\Gamma_5(40,770) with v=40 vertices and e=770 edges with the following adjacency matrix:

has \lambda_1+\lambda_2 \approx  40.0716> 40.

The characteristic polynomial is x(1+x)^31(-132-296x+1318x^2+1776x^3+ 122x^4- 620x^5-274x^6-31x^7+x^8) and the sorted numerical spectrum is (-3.12149, -2.35749, -1.92974, -1.75915, -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -0.253681, 1.79662*10^-17, 0.349958, 1.54489, 38.5267).

The graph has (sorted) degree sequence (39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 38, 38, 38, 38, 38, 37, 36, 36, 36, 35).

The energy is E(\Gamma_3(40,770)) \approx 80.8431.

The graph \Gamma_5(40,770) is, therefore, not isomorphic with \Gamma_1(40,772), \Gamma_2(40,770), \Gamma_3(40,772) or \Gamma_4(40,772)

An example of a graph with v = 39, e = 731 and \lambda_1+\lambda_2 \approx 39.0523 > 39 is \Gamma_1(39,731) with the following adjacency matrix and 3-dimensional representation:

The characteristic polynomial is -x(1+x)^30(2+x)(-4-164x+674x^2+523x^3-196x^4 -202x^5-32x^6+x^7) and the sorted numerical spectrum is (-3.08933, -2.24518, -2., -1.92719, -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -0.0223697, -4.2055*10^-16, 0.231765, 1.53858, 37.5137).

The graph has (sorted) degree sequence (38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38 ,38 ,38 ,38, 38, 38, 38, 38,38,38,38,38, 38, 38, 38, 38, 38, 37, 37, 37, 37, 37, 37, 37, 36, 35, 34, 34).

The energy is E(\Gamma_1(39,731) \approx 78.5681.

A 3 dimensional representation of the complementray graph is:

The vetcies apparently connected at the bottom left are in fact not connected, as the following, more prosaic, view shows:

I found this example by looking for a graph with 39 vertices, and taking a cue from \Gamma_1(40,770), looked for one with e \approx \frac{1}{2}39\times 38 \times \frac{770}{780} = 731.5.

Using the same strategy I found a graph \Gamma_1(38,694) with v= 38 vertices, e = 694 edges, and \lambda_1+\lambda_2 \approx 38.0326> 38. Apart from the diagonal the 0 entries in the adjacency matrix are in the following positions:

(5, 19), (8, 32), (9, 32), (14, 30), (14, 32), (16, 26), (19, 5), (19, 30), (19, 32), (19, 33), (26, 16),(30, 14), (30, 19),(32, 8), (32, 9), (32, 14), (32, 19), (33, 19).

The characteristic polynomial is x(1+x)^29(-1-3x+x^2+x^3)(48-297x- 484x^2-226x^3- 30x^4+x^5), the numerical spectrum (-3.06878, -2.17009, -1.92725, -1.68699, -1., -1., -1., -1., -1., -1., \-1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -0.311108, 2.86673*10^-16, 0.13162, 1.48119, 36.5514),the degree sequence (37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 36, 36, 36, 36, 36, 36, 35, 35, 33, 33) and the energy \approx 76.3284. Again, the complementary graph is disconnected:

Bojan Mohar just sent me a copy of his paper “On the sum of two largest eigenvalues of a symmetric matrix” (June 13, 2008 ) with Javad Ebrahimi, Vladimir Nikiforov and Azhvan Sheikh Ahmady.

Here’s the abstract:

D. Gernert conjectured that the sum of two largest eigenvalues of the adjacency matrix of any simple graph is at most the number of vertices of the graph. This can be proved, in particular, for all regular graphs. Gernert’s conjecture was recently disproved by one of the authors, who also provided a nontrivial upper bound for the sum of two largest eigenvalues. In this paper we improve the lower and upper bounds to near-optimal ones, and extend results from graphs to general non-negative matrices.

Here’s a pdf version of the paper: eigenvalue_sum_extended

Bojan Mohar comments (July 1, 2008 ) that Dieter Gernert’s modified conjecture cannot hold: “the proof towards the end of the paper shows that largest \lambda_1+ \lambda_2 is achieved when the complement is bipartite and (most likely) has several isolated vertices. The updated conjecture seems in a good direction but it is wrong again: Take our example (same paper) with 7n vertices and \tau_2 = \frac{8}{7} . Add one isolated vertex (or join it to the graph by one edge if you wish). This will make the complement connected but \tau_2 will not change much if n is large.”

Sum of largest two eigenvalues of a simple graph.1

The largest two, necessarily real, eigenvalues of a simple graph \Gamma play a significant role in the energy E(\Gamma) of \Gamma. The largest eigenvalue \lambda_1 is bounded below by \frac{2e}{v} where v is the number of vertices and E the number of edges of \Gamma. The largest eigenvalue features in Koolen & Moulton’s proof of the upper bound E(\Gamma) \leq \frac{2e}{v}+\sqrt{(v-1)(2e-(\frac{2e}{v})^2)}. Some knowledge of the second largest eigenvalue \lambda_2 might go some way toward sharpening this upper bound for E(\Gamma) in those – mostly common -cases in which this upper bound is not attained.

Dieter Gernert has proposed that the sum \lambda_1 + \lambda_2 of the largest two eigenvalues of a simple graph on v vertices should be bounded above by v. As he points out on the Spectral Graph Theory page, this is certainly true for regular graphs, and has also been proved for some special classes of graphs such as planar, toroidal, completely multipartite, and triangular-free graphs.

However, the graph \Gamma_1(40,770), with 40 vertices and 770 edges, whose adjacency matrix and – rather uninformative though beautiful – 3-dimensional representation appear below, has a sum of \approx 40.0032 > 40 for the sum of its two largest eigenvalues:


The characteristic polynomial is x(1+x)^{31}(-132-496x+1106x^2+1716x^3+120x^4-620x^5-274x^6-31x^7+x^8) and the sorted numerical spectrum is (-3.02512, -2.47643, -1.93174, -1.8057, -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -0.202583, -6.38638×10-17, 0.438356, 1.47645, 38.5268)

The graph has (sorted) degree sequence (39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 38, 38, 38, 38, 38, 38, 38, 38, 38, 37, 35, 34).

The energy is E(\Gamma_1(40,770)) \approx 80.8831 which is \approx 94.1\% of the Koolen-Moulten upper bound \frac{2e}{v}+\sqrt{(v-1)(2e-(\frac{2e}{v})^2)} for the energy of a graph with v=40 vertices and e=770 edges.

A statistical analysis of v-(\lambda_1+\lambda_2) for graphs with 13 \leq v \leq 60 vertices and v-1 \leq e \leq \frac{1}{2}v(v-1)-1 edges shows a very skewed distribution with a large mean \approx 9 and large standard deviation \approx 6:

Occasionally, but apparently rarely, despite the large standard deviation, v-(\lambda_1+\lambda_2) is negative, as it is for \Gamma_1(40,770).

Postscript: I just found this question has been answered. A definitive paper, on the existence of graphs with v< (\lambda_1+\lambda_2) is:

Nikiforov, V. (2006) Linear combinations of graph eigenvalues. Electronic Journal of Linear Algebra, 15, 329-336. (nikiforov-sums_of_graph_eigenvalues)

Nikiforov’s inequality max(\lambda_1+\lambda_2) >\frac{29 + \sqrt(329)}{42}v -25 for v \geq 21 shows that for all v \geq 205 there is a graph with \lambda_1+\lambda_2 > v .

Given the existence of \Gamma_1(40,770), what is the minimal v for which there is a graph on v vertices with \lambda_1 + \lambda_2 > v?

Given Nikiforov’s construction, Dieter Gernert has modified his question to: “Is is true that \lambda_1+\lambda_2 \leq v for all connected graphs \Gamma with v vertices and connected complementary graph \Gamma^{\circ}?” Dieter conjectures this is true.

Heavy tails of distributions of words in literary texts

My paper with Adam Callahan on the almost power-law behavior of the type-token ratio in literary texts has been submitted for publication and is available here:callahan_davis_heavy_tails_final

In this paper we look in some detail at the behavior of the average number of new words in the first n words of a text, as a function of n. Consider replacing each word in a text by a 1 if that word has not appeared previously in the text, and by a 0 if it has appeared previously. The type-token ratio \rho(n) is the average of the 1’s and 0’s up to n. The sequences of 1’s and 0’s we get this way are not independent, and is not equivalent to a sequence of Bernoulli trial. The behavior of \rho(n) as a function of n is that it is almost described by a power law. Basically, a regression of \log(\rho(n)) on log(n) gives a straight line with, typically, r^2\geq 0.95. This means there are constants A, d such that \rho(n) \approx \frac{A}{n^d}. When we plot \rho(n)n^d we should get, approximately, a constant. However we do not. Typically we get a plot that, as a function of n, increases relatively rapidly to a maximum and then decreases very slowly.

A real-valued function f of a positive integer variable is slowly varying if \frac{f(mn)}{f(m)} \to n^k as m \to \infty, for some number k. (This is a “soft analysis” definition in Terry Tao’s sense, I believe. In the applications we have in mind it would definitely help to make this a “hard analysis” definition).

Not only, is \rho(n) a slowly varying function, but \rho(n)n^d – with d as above – has a variance that is slowly varying and decaying to 0. We call functions that are slowly varying with a variance that decays to 0 as a power function, ultra-slowly varying. So what we find for literary texts of a variety of lengths, languages and genres, is that there is an index d (dependent on the text) such that \rho(n)n^d is ultra-slowly varying. If the ultra-slowly varying function were a constant we would have a genuine power law, but it isn’t and we don’t.

We find that for the literary texts we examined, there is a point in the text, which we call a turn-over point, beyond which the type-token ratio \rho(n) is very well described by a power law (typically, r^2 > 0.99) and before which, \rho(n) is better described as a function of the form \frac{\alpha+\beta log(n)}{n^d}.

In this paper we also consider the relative frequency of occurrence of a word in the first n words of a text as an estimate of the word’s probability of occurrence in the text, and consider the Shannon entropy H(n) of the first n words of text. Typically, H(n) increases logarithmically with n.

See also: Physics of Text.