Research

My current interests lie in the union (or better yet, the intersection) of combinatorics, probability, linear algebra, and algorithms.

Publications

Random symmetric matrices are almost surely non-singular. Duke Math. J. 135 (2006), 395-413 (with T. Tao and V. Vu)

Consider a symmetric matrix whose entries on or above the main diagonal are independent and equally likely to be 1 or -1. By adapting an argument of Komlós, we show that the probability that this matrix is singular is O(n^(-1/8+ε)) for any positive ε. The key difficulty is that, because of the symmetry condition, the determinant is a quadratic function of the entries in any given row (as opposed to a linear one in the non-symmetric case). To overcome this, we develop a quadratic extension of the Littlewood-Offord inequality to show that this form is unlikely to be 0. (An improved version of this lemma is one subject of the "Bilinear and quadratic variants..." paper below.)
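As a quick numerical illustration of the claim (mine, not part of the paper), here is a minimal Monte Carlo sketch in Python, assuming numpy; since the entries are small integers, floating-point rank is reliable at these sizes:

    import numpy as np

    rng = np.random.default_rng(0)

    def singular_fraction(n, trials=2000):
        """Estimate the probability that a random symmetric +/-1 matrix is singular."""
        count = 0
        for _ in range(trials):
            upper = rng.choice([-1, 1], size=(n, n))
            sym = np.triu(upper) + np.triu(upper, 1).T  # mirror the upper triangle
            if np.linalg.matrix_rank(sym) < n:
                count += 1
        return count / trials

    for n in (4, 6, 8, 10):
        print(n, singular_fraction(n))

The empirical singular fraction decays quickly in n, consistent with (though of course far from proving) a polynomial bound.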

The rank of random graphs. Random Structures and Algorithms 33 (2008), 269-285 (with V. Vu)

One corollary of the results in the previous paper is that for any fixed p the adjacency matrix of the Erdős-Rényi graph G(n,p) is almost surely nonsingular. Here we consider the case where p = o(1). The main result is the following threshold behavior: If p > c ln n/n, where c > 1/2, then the non-zero rows of the adjacency matrix are almost surely independent. This is tight, since if p is at most c ln n/n where c < 1/2, then the adjacency matrix almost surely has pairs of equal nonzero rows. In particular, ln n/n is a threshold for the entire matrix being non-singular.
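A small simulation (again my own sketch, assuming numpy) makes the dichotomy easy to see, comparing the rank of the adjacency matrix to the number of nonzero rows on either side of the threshold:

    import numpy as np

    rng = np.random.default_rng(1)

    def rank_vs_nonzero_rows(n, p):
        """Sample G(n,p); return (rank of adjacency matrix, # of nonzero rows)."""
        upper = np.triu(rng.random((n, n)) < p, 1).astype(int)
        adj = upper + upper.T                  # symmetric, zero diagonal
        nonzero = int((adj.sum(axis=1) > 0).sum())
        return np.linalg.matrix_rank(adj), nonzero

    n = 200
    for c in (0.25, 0.75, 1.5):
        p = c * np.log(n) / n
        print(c, rank_vs_nonzero_rows(n, p))

For c comfortably above 1/2 the two numbers typically agree; below 1/2 the rank tends to fall short of the number of nonzero rows.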

The rank of sparse random matrices. Combinatorics, Probability, and Computing. 19 (2010), 321-342 (with V. Vu)

As mentioned above, for p less than ln n/n the adjacency matrix of G(n,p) is almost surely singular. In this paper we are able to quantify exactly the extent of this singularity as p drops below ln n/n. To be more precise, we consider the following model: Given a fixed (symmetric) matrix W which is nonzero off the main diagonal, we "sparsify" W by independently replacing each entry by 0 with probability 1-p. As p approaches 0, the sparsified matrix will begin to develop collections of s rows which between them have fewer than s nonzero columns. Such collections will be dependent regardless of the initial matrix W.

One main result of this paper is that such "local" structures of the sparsification process are almost surely the only source of singularity for p not too small. If p is at least ln n/(sn) (with s fixed), then with high probability the only dependencies in the rows of the sparsified matrix will come from collections of at most s-1 rows that fail to contain enough nonzero columns.
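To see these local structures directly, here is a rough diagnostic (my own sketch, not the paper's argument), assuming numpy: sparsify the all-ones matrix symmetrically and compare the corank against counts of the smallest offending structures (s = 1, a zero row; s = 2, two rows supported on at most one common column):

    import numpy as np
    from itertools import combinations

    rng = np.random.default_rng(2)

    n = 120
    W = np.ones((n, n)) - np.eye(n)            # a fixed W, nonzero off the diagonal
    p = 0.6 * np.log(n) / n
    keep = np.triu(rng.random((n, n)) < p, 1)
    A = W * (keep | keep.T)                    # symmetric sparsification

    corank = n - np.linalg.matrix_rank(A)
    zero_rows = int((A.sum(axis=1) == 0).sum())
    supports = [set(np.flatnonzero(A[i])) for i in range(n)]
    small_pairs = sum(1 for i, j in combinations(range(n), 2)
                      if len(supports[i] | supports[j]) <= 1)
    print(corank, zero_rows, small_pairs)

The counts are only a loose accounting (the structures can overlap), but at this p most of the corank is explained by them.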

Balancing Gaussian vectors. Israel J. Math. 172 (2009), 145-156

Here we consider the following question: Given a (fixed) norm ||.|| on R^d and vectors x_1, x_2, ..., x_n, what can be said about

min || ± x_1 ± x_2 ± ... ± x_n ||,

where the minimum is taken over all 2^n possible sign sequences? Although in the worst case this minimum may be as large as d (as shown by Bárány and Grinberg), it turns out to be much smaller (actually exponentially small in n) most of the time when the x_i are independent Gaussians.
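A brute-force check (not from the paper) is easy for small n; the sketch below, assuming numpy and the Euclidean norm for concreteness, minimizes over all 2^n sign choices:

    import itertools
    import numpy as np

    rng = np.random.default_rng(3)

    def best_balance(X):
        """Minimize ||e_1 x_1 + ... + e_n x_n|| over all sign vectors e in {-1,1}^n."""
        return min(np.linalg.norm(np.array(signs) @ X)
                   for signs in itertools.product((-1, 1), repeat=len(X)))

    d, n = 3, 14
    X = rng.standard_normal((n, d))            # independent Gaussian vectors
    print(best_balance(X))

Repeating the experiment while increasing n shows the minimum shrinking rapidly, in line with the exponentially small typical value.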

Bilinear and quadratic variants on the Littlewood-Offord problem. Israel J. Math. 194 (2013), 359-394

Let f(x_1, x_2, ..., x_n) be a polynomial in a large number of independent random variables. A natural question to ask is how dispersed such an f becomes as the number of variables increases. To be more concrete, if we independently set each x_i equal to ±1, what is the maximum concentration of f on a single value, and which polynomials come close to achieving this concentration?

If f is a linear form, we get a rescaled version of a problem originally asked by Littlewood and Offord and answered by Erdős: The largest concentration is of order n^(-1/2) and occurs when all of the nonzero coefficients of f are equal. Here we consider the case where f is instead a bilinear or quadratic form.

In the bilinear case, we show that every form with concentration significantly larger than n^(-1) must differ in only a few coefficients from a form which factors as f(x,y) = g(x)h(y) (such degenerate forms can have concentration as large as n^(-1/2)). In the quadratic case, we show that no form has concentration significantly larger than n^(-1/2). In both cases these results are nearly tight.
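The contrast is easy to see empirically. In the sketch below (my illustration, assuming numpy), the linear form is the all-ones form and the quadratic form is the sum of x_i x_j over i < j, which, since it equals ((x_1+...+x_n)^2 - n)/2, concentrates at the n^(-1/2) scale:

    import numpy as np
    from collections import Counter

    rng = np.random.default_rng(4)

    def max_concentration(form, n, trials=20000):
        """Empirical maximum probability that `form` hits a single value."""
        counts = Counter(form(rng.choice((-1, 1), size=n)) for _ in range(trials))
        return max(counts.values()) / trials

    linear = lambda x: int(x.sum())            # all nonzero coefficients equal
    quadratic = lambda x: (int(x.sum()) ** 2 - len(x)) // 2  # sum over i<j of x_i x_j

    for n in (25, 100, 400):
        print(n, max_concentration(linear, n), max_concentration(quadratic, n))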

A weaker version of the bound for quadratic forms was used in the random matrix papers above to bound the probability that the determinant of a random symmetric matrix (which is a quadratic form in the entries of any particular row or column) is equal to 0. An immediate corollary of the results here is a slight improvement in the singularity bounds in those papers.

Concentration of random determinants and permanent estimators. SIAM J. Discrete Math. 23 (2009), 1356-1371 (with V. Vu)

Let A be a matrix whose entries are independent random variables with variances bounded above and below by constants. Here we are interested in the following question: How close does |det(A)| typically lie to its expectation? There are two natural ways to bound this. One approach (used by Girko and Tao-Vu) is to use the determinant's definition as an oriented volume to write

|det(A)| = d_1 d_2 ... d_n,

where d_i is the distance from row i to the span of the prior i-1 rows, then use probabilistic techniques (e.g. Talagrand's inequality) to show that most of the d_i lie close to their expectation. While this works well when the entries are iid (or even just have equal variances), in the general case this becomes more difficult, since the distribution of d_i depends heavily on the structure of the subspace spanned by the first i-1 rows.
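The identity itself is elementary (it is Gram-Schmidt in disguise); this little numpy check (mine, not the paper's) computes each d_i as a least-squares residual and compares the product with |det(A)|:

    import numpy as np

    rng = np.random.default_rng(5)

    def row_distances(A):
        """d_i = distance from row i to the span of rows 0..i-1."""
        dists = [np.linalg.norm(A[0])]
        for i in range(1, A.shape[0]):
            coef, *_ = np.linalg.lstsq(A[:i].T, A[i], rcond=None)
            dists.append(np.linalg.norm(A[i] - A[:i].T @ coef))
        return np.array(dists)

    A = rng.standard_normal((8, 8))
    print(np.prod(row_distances(A)), abs(np.linalg.det(A)))  # agree up to rounding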

Instead we follow the ideas of Friedland, Rider and Zeitouni by considering the determinant as a product of singular values,

|det(A)| = σ_1 σ_2 ... σ_n,

and using results of Guionnet and Zeitouni stating that each σ_i is concentrated. The catch is that the determinant is very sensitive to perturbations in the smallest singular values, which need to be handled separately. Our key improvement over the Friedland-Rider-Zeitouni result lies in our handling of these small singular values (adapting recent results of Tao-Vu and Rudelson-Vershynin along with some geometric arguments), which enables both an improved bound on the concentration and the possibility of entries having non-Gaussian (e.g. discrete) probability distributions.
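The singular-value identity and the fragility at the bottom of the spectrum can both be seen in a few lines of numpy (my sketch, not the paper's argument):

    import numpy as np

    rng = np.random.default_rng(6)

    A = rng.standard_normal((50, 50))
    sigma = np.linalg.svd(A, compute_uv=False)   # sigma_1 >= ... >= sigma_n
    print(np.prod(sigma), abs(np.linalg.det(A))) # agree up to rounding

    # An additive perturbation delta of the least singular value rescales the
    # determinant by (sigma_n + delta) / sigma_n, which blows up as sigma_n -> 0.
    delta = 1e-3
    print(sigma[-1], (sigma[-1] + delta) / sigma[-1])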

Finding patterns avoiding many monochromatic constellations. Experimental Mathematics 19 (2010), 399-411 (with Steve Butler and Ron Graham).

Here we consider the following problem: Given an arithmetic configuration which is invariant under scaling and translation (e.g. three-term arithmetic progressions, or solutions to 2x+3y=5z), we want to color the integers from 1 to a large n in such a way as to avoid monochromatic copies of the configuration. By van der Waerden's theorem we know that we cannot avoid them entirely, so we instead focus on creating as few as possible. Here we outline a method of experimentally determining the likely optimal coloring by a combination of perturbation and local optimization (unfortunately we can only guarantee local minimality as opposed to global). We also show that for three-point configurations the random coloring is never asymptotically optimal.
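For three-term progressions, a bare-bones version of the local optimization step looks like the following (a simplified sketch, not the perturbation scheme from the paper): count monochromatic progressions and flip single points until no flip helps.

    import numpy as np

    rng = np.random.default_rng(7)

    def mono_3aps(color):
        """Count monochromatic 3-term APs x, x+d, x+2d in a 0/1 coloring of 0..n-1."""
        n = len(color)
        return sum(1 for d in range(1, (n + 1) // 2)
                     for x in range(n - 2 * d)
                     if color[x] == color[x + d] == color[x + 2 * d])

    def local_opt(color):
        """Flip single points until no flip lowers the count (a local minimum only)."""
        best, improved = mono_3aps(color), True
        while improved:
            improved = False
            for i in range(len(color)):
                color[i] ^= 1
                c = mono_3aps(color)
                if c < best:
                    best, improved = c, True
                else:
                    color[i] ^= 1              # undo an unhelpful flip
        return best

    n = 60
    print(local_opt(rng.integers(0, 2, n).tolist()), mono_3aps([0] * n))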

On randomizing two derandomized greedy algorithms. Journal of Combinatorics 1 (2010), 265-284 (with Asaf Shapira and Prasad Tetali)

Many of the simplest (and most easily implemented) approximation algorithms can be thought of as greedy derandomizations of simple randomized algorithms. Here we consider two such approximation algorithms and ask whether running them on a random permutation of the input variables can improve their performance ratio.

For Johnson's approximation algorithm for MAX-SAT (which was shown by Chen, Friesen, and Zheng to have performance ratio 2/3 in the worst case), we show that randomization provides a non-trivial improvement: the worst-case expected performance ratio improves to 2/3 + c for some absolute constant c > 0.
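For concreteness, here is one standard presentation of Johnson's algorithm together with the random-permutation twist (a sketch under my own encoding of clauses as lists of signed integers; it shows the mechanics, not the worst-case analysis):

    import random

    def johnson(clauses, order):
        """Johnson's greedy for MAX-SAT; literal +v means x_v, -v means NOT x_v."""
        live = [set(c) for c in clauses]       # unsatisfied clauses, unset literals only
        satisfied = 0
        for v in order:
            w_true = sum(2.0 ** -len(c) for c in live if v in c)
            w_false = sum(2.0 ** -len(c) for c in live if -v in c)
            lit = v if w_true >= w_false else -v
            nxt = []
            for c in live:
                if lit in c:
                    satisfied += 1             # clause satisfied, drop it
                else:
                    c.discard(-lit)            # a falsified literal disappears
                    if c:
                        nxt.append(c)          # empty clauses can never be satisfied
            live = nxt
        return satisfied

    clauses = [[1, 2], [-1, 3], [-2, -3], [1, -3], [2, 3]]
    variables = [1, 2, 3]
    random.shuffle(variables)                  # the randomized variant
    print(johnson(clauses, variables), "of", len(clauses))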

For the greedy algorithm for MAX-CUT, we show by contrast that the additional randomization does not provide a substantial improvement. In particular, there are bipartite graphs for which randomized greedy MAX-CUT fails to cut a 1/2 + c fraction of the edges for any fixed c > 0. This provides a counterpoint to results of Mathieu and Schudy showing that randomized greedy provides a 1-ε approximation for dense graphs, and answers a question of theirs in the negative.
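The greedy MAX-CUT algorithm in question is equally short; in the sketch below (again my own illustration, using only the standard library), each vertex in turn joins the side that cuts more of the edges to already-placed neighbors:

    import random

    def greedy_maxcut(adj, order):
        """Place each vertex on the side cutting more already-placed neighbors."""
        side = {}
        for v in order:
            on_one = sum(1 for u in adj[v] if side.get(u) == 1)
            on_zero = sum(1 for u in adj[v] if side.get(u) == 0)
            side[v] = 0 if on_one >= on_zero else 1
        return sum(1 for v in adj for u in adj[v] if u > v and side[u] != side[v])

    adj = {v: [(v - 1) % 6, (v + 1) % 6] for v in range(6)}  # the 6-cycle
    order = list(adj)
    random.shuffle(order)                      # randomized greedy
    print(greedy_maxcut(adj, order), "of", sum(map(len, adj.values())) // 2)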

An extended abstract of this paper appeared in the Proceedings of the Symposium on Discrete Algorithms (SODA) 2011.

Stochastic Matching With Commitment (with Prasad Tetali and Pushkar Tripathi)

Here we consider the problem of trying to find a large matching in an unknown graph. At each step we can choose two as-yet unmatched vertices and try to pair them together, removing both from the graph if a match is found. The catch is that without knowing the graph in advance, it is impossible to always find the largest possible matching (for example, by bad luck we might match the middle edge of a three-edge path instead of matching the two outside edges).

In our case, the graph in question is an inhomogeneous Erdős-Rényi graph, meaning that every edge appears with a (possibly different) known probability. The question then becomes how to exploit this knowledge to develop an algorithm letting us match as many edges (on average) as possible.
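As a naive baseline (not the algorithm from the paper), one can probe pairs greedily in decreasing order of their known probabilities, committing whenever a probe succeeds; a numpy sketch:

    import numpy as np

    rng = np.random.default_rng(8)

    def greedy_probe(P):
        """Probe pairs by decreasing probability; success commits both endpoints."""
        n = P.shape[0]
        pairs = sorted(((P[i, j], i, j) for i in range(n) for j in range(i + 1, n)),
                       reverse=True)
        free, matched = set(range(n)), 0
        for p, i, j in pairs:
            if i in free and j in free and rng.random() < p:
                free -= {i, j}
                matched += 1
        return matched

    P = rng.random((10, 10)) * 0.3             # known edge probabilities
    print(greedy_probe(P))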

An Extended Abstract of this paper appears in the proceedings of ICALP 2012 (Lecture Notes in Computer Science 7391, Springer, pp. 822-833).

Information Gathering in Ad-Hoc Radio Networks with Tree Topology (with Marek Chrobak, Leszek Gasieniec, and Dariusz Kowalski). Information and Computation 258 (2018), 1-27

Here we consider several questions related to the following (simplified) model of radio transmission: Nodes are arranged in a tree whose structure is unknown to us. Each node has a message and is trying to pass it up to the root of the tree. The catch is that if two children try to send messages to their parent at the same time, the messages become garbled and the parent understands neither. Furthermore, communication only goes in one direction: A node sending a message to its parent has no idea whether the parent received the message, or what other nodes are trying to send to the same parent. In the paper, we devise algorithms which guarantee that all the messages reach the root in a specified time under various assumptions on how nodes can store and transmit messages (e.g. depending on whether nodes can aggregate messages, or whether nodes have the memory to store messages and pass them on later). In several cases, we also provide matching lower bounds (within a constant factor, in one case even up to the correct constant) guaranteeing that any significantly faster algorithm must fail on some tree. (An Extended Abstract of this paper appears in the proceedings of COCOA 2014 (Lecture Notes in Computer Science 8881, Springer, pp. 129-145).)
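The collision rule itself is simple to state in code. Here is a tiny sketch of a single round under my own encoding (a parent map and a set of transmitting nodes), just to pin down the model:

    from collections import Counter

    def run_round(parent, transmitters):
        """One round: v's message reaches parent[v] iff v is the unique
        transmitting child of parent[v]."""
        load = Counter(parent[v] for v in transmitters)
        return {v for v in transmitters if load[parent[v]] == 1}

    # A path 3 -> 2 -> 1 -> 0, plus a second child 4 of node 1.
    parent = {1: 0, 2: 1, 3: 2, 4: 1}
    print(run_round(parent, {2, 4}))           # collision at node 1: set()
    print(run_round(parent, {2, 3}))           # different parents: {2, 3}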

Faster Information Gathering in Ad-Hoc Radio Tree Networks (with Marek Chrobak), Algorithmica 80(3) (2018), 1013-1040

Here we consider two questions related to the radio transmission model discussed in the previous paper.

(1) Suppose that we are not allowed to aggregate messages (i.e. a node holding several messages must transmit them one by one). In this model we give an algorithm that runs in time proportional to n log log n on all trees with n vertices. This is very close to the (trivial) lower bound of n that comes from the observation that each message must reach the root individually.

(2) Suppose further that parents are able to acknowledge successful transmissions (i.e. each time a node tries to transmit a message, it learns whether the message reached its parent or not). In this case we can remove the log log n factor from the above bound, giving an algorithm that runs within a constant factor of optimal on all trees.

(An Extended Abstract of this paper appears in the proceedings of LATIN 2016 (Lecture Notes in Computer Science 9644, Springer, pp. 275-289).)

On the Number of Integral Graphs (with Parker Williams), Linear Algebra and its Applications 493 (2016), 447-454.

The integral graphs of the title are graphs where every eigenvalue of the adjacency matrix is an integer. A common theme running through many of the above papers is that any fixed integer is unlikely to be an eigenvalue of the adjacency matrix of a random graph. So it should be far less likely for every eigenvalue of a graph to be an integer.

In this paper we confirm this, showing that the probability that a random graph is integral is at most exp(-C n^(3/2)) for some absolute constant C. This strengthens a previous result of Ahmadi, Alon, Blake, and Shparlinski that gave an exponential upper bound on the probability.
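Testing integrality numerically is straightforward (floating-point eigenvalues with a tolerance; my sketch, assuming numpy), and random graphs essentially never pass while structured examples like complete graphs do:

    import numpy as np

    rng = np.random.default_rng(9)

    def is_integral(adj, tol=1e-8):
        """True if every adjacency eigenvalue is within tol of an integer."""
        eig = np.linalg.eigvalsh(adj)
        return bool(np.all(np.abs(eig - np.round(eig)) < tol))

    def random_graph(n, p=0.5):
        upper = np.triu((rng.random((n, n)) < p).astype(int), 1)
        return upper + upper.T

    print(is_integral(random_graph(30)))       # almost surely False
    print(is_integral(np.ones((4, 4), dtype=int) - np.eye(4, dtype=int)))  # K_4: True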

The bounds in this paper are likely far from optimal (the correct exponent should be proportional to n^2), and improving them seems like an interesting problem.