Preprint
Article

On a Solution of the “P vs NP” Millennium Prize Problem Based on the Subset Sum Problem

Submitted: 12 October 2023

Posted: 13 October 2023

Abstract
Consider a set of distinct non-negative integers $X^n$ and a target certificate $S$ in the parametrized form $\exists X^k \subseteq X^n,\ \sum_{x_i \in X^k} x_i = S$ ($k = |X^k|$, $n = |X^n|$). We present a polynomial solution of the subset sum problem with time complexity $T \le O(kn) \le O(n^2)$ and space complexity $S \le O\left(\frac{(n-1)n}{2}\right) \le O(n^2)$, so that P = NP.
Subject: Computer Science and Mathematics  -   Computer Science

1. Introduction

We consider the subset sum problem: given a set of distinct non-negative integers $X^n$ and a value sum (certificate) $S$, determine whether there is a subset $X^k$ of the given set whose sum equals the given sum $S$.
For this problem, exponential [1,2] and pseudopolynomial [3,4,5,6,7] algorithms and exhaustive search methods based on the “divide and conquer” principle [8] have been developed. The complexity of algorithms was considered in [9,10,11].
The subset sum problem has been proved to be NP-complete. By a well-known theorem, if an NP-complete problem has a polynomial solution, then P = NP.
The main idea of the proposed approach is to transform the set $X^n$ (input data of length $n$) into subsets based on the combination function $C_n^k$, arranged as two-dimensional symmetric matrices from which the main diagonal and all elements below it have been removed. These matrices are therefore triangular. Since the elements of the set $X^n$ are indexed, the matrices are rewritten in terms of the indexes and the sums of the indexes, which we call index certificates. An index certificate corresponds to the certificate $S$. Note that the elements along each diagonal of the index certificate matrices are equal to each other. In this case, the time required for selecting subsets $X^k$ (selected data of length $k$) is proportional to the number of elements in the diagonal used, which is less than $C_n^k$. To the best of our knowledge, the proposed polynomial solution with time complexity $T \le O(kn) \le O(n^2)$ and space complexity $S \le O\left(\frac{(n-1)n}{2}\right) \le O(n^2)$ is the fastest general algorithm for this problem, and it reduces to the problem of selecting the indexes of the elements of $X^n$.
The paper is organized as follows:
- the problem statement in parametrized form and data input methods based on triangular two-dimensional matrices;
- a lemma estimating the required time $T$ and space $S$;
- algorithms for solving the subset sum problem and examples confirming the claimed results.

The suggested algorithms can be implemented as a standalone, self-contained software module (for diverse problems) or as hardware chips.

2. Problem statement.

Several statements of the subset sum problem are known [1,2,12]. We propose a parametrized form in the following statement:
$$\exists X^k \subseteq X^n:\ \sum_{x_i \in X^k} x_i = S. \tag{1}$$
The subset $X^k$ can be selected based on the combination function:
$$C_n^k = \frac{n!}{k!\,(n-k)!} = \frac{n(n-1)(n-2)\cdots(n-k+1)}{k!}. \tag{2}$$
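As a point of reference for statement (1), the following minimal Python sketch checks it by direct enumeration of all $C_n^k$ subsets of a fixed size $k$. It is an illustrative baseline under stated assumptions, not the method proposed in this paper: the function name `subsets_with_sum` is hypothetical, and the input is the set $X^8$ used in the examples of Section 5.

```python
from itertools import combinations
from math import comb

def subsets_with_sum(xs, k, target):
    """Return every k-element subset of xs whose elements sum to target.

    Baseline only: this visits all C(n, k) subsets (hypothetical helper,
    not the algorithm of the paper).
    """
    return [c for c in combinations(xs, k) if sum(c) == target]

xs = [10, 14, 17, 20, 36, 38, 43, 47]   # the set X^8 from the examples
print(comb(len(xs), 2))                 # number of 2-element subsets: 28
print(subsets_with_sum(xs, 2, 53))      # [(10, 43), (17, 36)]
```

The count printed first is the value of the combination function (2) for $n = 8$, $k = 2$.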

3. Data input methods.

Initially, let us introduce the subset of partial sums $Z^l$, which is associated with the problem statement (1):
$$Z^l = \{z_1, z_2, \ldots, z_l\},\quad z_j = \sum_{x_i \in X^k} x_i,\quad X^k \subseteq X^n,\quad j \in L = \{1, 2, \ldots, l\},\quad l = C_n^k, \tag{3}$$
and the subset of index combinations:
$$S^l = \{s_1, s_2, \ldots, s_l\},\quad s_j = \sum_{n_i \in N^k} n_i,\quad N^k \subseteq N^n,\quad j \in L = \{1, 2, \ldots, l\},\quad l = C_n^k, \tag{4}$$
where $N^n$ is a set of natural numbers, $n = |N^n|$, $k = |N^k|$; this subset is necessary to ensure correspondence with the subset $Z^l$.
In the following discussion, we presume that these subsets are sorted and can be represented as two-dimensional matrices. For $k = 2$ the subset $Z^l$ can be represented as a triangular two-dimensional matrix of order $(n-1) \times (n-1)$:
$$
\begin{array}{lllll}
x_1 + x_2 & x_1 + x_3 & \cdots & x_1 + x_{n-1} & x_1 + x_n \\
x_2 + x_3 & x_2 + x_4 & \cdots & x_2 + x_{n-1} & x_2 + x_n \\
\vdots & & & & \\
x_{n-2} + x_{n-1} & x_{n-2} + x_n & & & \\
x_{n-1} + x_n & & & &
\end{array}
\tag{5}
$$
Let us represent matrix (5) with respect to index combinations. It is enough to apply the concatenation operator and attach the number 1 to the elements of the set $N^n$, starting from the second element to the end; then attach the number 2 to the elements of this set, starting from the third element to the end, and so forth until reaching the last element $(n-1)\,n$:
$$
\begin{array}{lllll}
1\,2 & 1\,3 & \cdots & 1\,(n-1) & 1\,n \\
2\,3 & 2\,4 & \cdots & 2\,(n-1) & 2\,n \\
\vdots & & & & \\
(n-2)\,(n-1) & (n-2)\,n & & & \\
(n-1)\,n & & & &
\end{array}
\tag{6}
$$
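Thus the subset of index combinations $S^l$ can be represented with respect to the partial index certificates $s_j$ as a triangular two-dimensional matrix of order $(n-1) \times (n-1)$:
$$
\begin{array}{lllll}
1 + 2 & 1 + 3 & \cdots & 1 + (n-1) & 1 + n \\
2 + 3 & 2 + 4 & \cdots & 2 + (n-1) & 2 + n \\
\vdots & & & & \\
(n-2) + (n-1) & (n-2) + n & & & \\
(n-1) + n & & & &
\end{array}
\tag{7}
$$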
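The construction of matrices (5), (6) and (7) for $k = 2$ can be sketched in a few lines of Python. This is a minimal illustration, assuming the matrices are stored as lists of rows of decreasing length; the function name `pair_matrices` and the representation of an index combination as a tuple `(i, j)` are choices made here, not notation from the paper.

```python
def pair_matrices(xs):
    """Build the k = 2 triangular matrices as lists of rows.

    Row i (1-based) pairs element i with every later element j > i.
    Returns (partial sums, index combinations, index certificates),
    corresponding to matrices (5), (6) and (7).
    """
    n = len(xs)
    sums, index_pairs, index_sums = [], [], []
    for i in range(1, n):
        js = range(i + 1, n + 1)
        sums.append([xs[i - 1] + xs[j - 1] for j in js])
        index_pairs.append([(i, j) for j in js])
        index_sums.append([i + j for j in js])
    return sums, index_pairs, index_sums

xs = [10, 14, 17, 20, 36, 38, 43, 47]   # the set X^8 from the examples
sums, index_pairs, index_sums = pair_matrices(xs)
for row in index_sums:
    print(row)
```

Each position holds the same pair of indexes in all three matrices, which is what allows a partial sum to be looked up together with its index certificate.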
For $k = 3$ the subset $Z^l$ is represented as triangular two-dimensional matrices of order $(n-2) \times (n-2)$, $(n-3) \times (n-3)$, …, $1 \times 1$, respectively:
$$
\begin{array}{lllll}
x_1 + x_2 + x_3 & x_1 + x_2 + x_4 & \cdots & x_1 + x_2 + x_{n-1} & x_1 + x_2 + x_n \\
\vdots & & & & \\
x_1 + x_{n-2} + x_{n-1} & x_1 + x_{n-2} + x_n & & & \\
x_1 + x_{n-1} + x_n & & & & \\
\hline
x_2 + x_3 + x_4 & x_2 + x_3 + x_5 & \cdots & x_2 + x_3 + x_{n-1} & x_2 + x_3 + x_n \\
\vdots & & & & \\
x_2 + x_{n-2} + x_{n-1} & x_2 + x_{n-2} + x_n & & & \\
x_2 + x_{n-1} + x_n & & & & \\
\text{and so forth} & & & & \\
\hline
x_{n-3} + x_{n-2} + x_{n-1} & x_{n-3} + x_{n-2} + x_n & & & \\
x_{n-3} + x_{n-1} + x_n & & & & \\
\hline
x_{n-2} + x_{n-1} + x_n & & & &
\end{array}
\tag{8}
$$
A simple way to construct matrices (8) is to add element $x_1$ to the elements of matrix (5), starting from the second row and continuing to the end; then add element $x_2$ to the elements of matrix (5), starting from the third row and continuing to the end, and so forth until reaching the last element $x_{n-2} + x_{n-1} + x_n$. The number of these matrices is $n - 2$. Let us represent matrices (8) with respect to index combinations. It is enough to apply the concatenation operator and attach the number 1 to the elements of matrix (6), starting from the second row; then attach the number 2 to the elements starting from the third row, and so forth until reaching the last element $(n-2)\,(n-1)\,n$:
$$
\begin{array}{llll}
1\,2\,3 & 1\,2\,4 & \cdots & 1\,2\,n \\
\vdots & & & \\
1\,(n-2)\,(n-1) & 1\,(n-2)\,n & & \\
1\,(n-1)\,n & & & \\
\hline
2\,3\,4 & 2\,3\,5 & \cdots & 2\,3\,n \\
\vdots & & & \\
2\,(n-2)\,(n-1) & 2\,(n-2)\,n & & \\
2\,(n-1)\,n & & & \\
\hline
(n-3)\,(n-2)\,(n-1) & (n-3)\,(n-2)\,n & & \\
(n-3)\,(n-1)\,n & & & \\
\hline
(n-2)\,(n-1)\,n & & &
\end{array}
\tag{9}
$$
Then matrices (9) can be represented with respect to the partial index certificates $s_j$ as triangular two-dimensional matrices of order $(n-2) \times (n-2)$, $(n-3) \times (n-3)$, …, $1 \times 1$, respectively:
$$
\begin{array}{llll}
1 + 2 + 3 & 1 + 2 + 4 & \cdots & 1 + 2 + n \\
\vdots & & & \\
1 + (n-2) + (n-1) & 1 + (n-2) + n & & \\
1 + (n-1) + n & & & \\
\hline
2 + 3 + 4 & 2 + 3 + 5 & \cdots & 2 + 3 + n \\
\vdots & & & \\
2 + (n-2) + (n-1) & 2 + (n-2) + n & & \\
2 + (n-1) + n & & & \\
\hline
(n-3) + (n-2) + (n-1) & (n-3) + (n-2) + n & & \\
(n-3) + (n-1) + n & & & \\
\hline
(n-2) + (n-1) + n & & &
\end{array}
\tag{10}
$$
A horizontal rule separates the matrices from each other.
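The block structure for $k = 3$ can be sketched in Python in the same spirit. This is an illustrative construction under the assumption that each block is stored as a list of rows and that each entry carries its index combination, partial sum and index certificate together; the name `triple_blocks` is hypothetical.

```python
def triple_blocks(xs):
    """Build the k = 3 triangular blocks described by matrices (8)-(10).

    Block a (leading index a = 1..n-2) has n-1-a rows; each entry is a
    tuple (index combination, partial sum, index certificate).
    """
    n = len(xs)
    blocks = []
    for a in range(1, n - 1):              # leading index of the block
        block = []
        for b in range(a + 1, n):          # second index: one row per b
            row = [((a, b, c),
                    xs[a - 1] + xs[b - 1] + xs[c - 1],
                    a + b + c)
                   for c in range(b + 1, n + 1)]
            block.append(row)
        blocks.append(block)
    return blocks

xs = [10, 14, 17, 20, 36, 38, 43, 47]      # the set X^8 from the examples
blocks = triple_blocks(xs)
print(len(blocks))                         # n - 2 blocks
print(blocks[0][0][0])                     # first entry of the first block
```

For $n = 8$ this yields six blocks whose first rows have 6, 5, …, 1 entries, matching the orders $(n-2) \times (n-2)$, …, $1 \times 1$ stated above.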
The proposed method makes it possible not only to construct triangular two-dimensional matrices for $k \ge 4$ but also to establish a correspondence between the partial sum $z_j$ and the partial index certificate $s_j$ (in particular, based on the index combination matrices (6) and (9)):
Definition. A correspondence between two subsets $Z^l$ and $S^l$ is any subset $F \subseteq Z^l \times S^l$. We write $F: Z^l \to S^l$, and instead of $(z, s) \in F$ we write $s \in F(z)$, or
$$F: Z^l \to S^l,\quad s \in F(z). \tag{11}$$
In other words, if sets $A$ and $B$ are given, then each element of $A$ can correspond to any number of elements of $B$, including none. A mapping is referred to as a one-to-one correspondence if for every $a$ there exists a unique $b$ such that $(a, b) \in F$.
The index combination matrix (6) indexes the partial sum matrix (5) and the partial index certificate matrix (7); likewise, matrices (9) index matrices (8) and (10).
Thus, the proposed problem statement (1) is directly related to the two-dimensional matrices (5) and (8) for $k = 2$ and $k = 3$. For $k \ge 4$, it is possible to construct matrices similar to matrices (5) and (8) related to the problem statement (1).
Now we can suggest a novel approach to solving problem (1) and construct a framework for managing the indexes of the set $X^n$. For this purpose, an auxiliary subset sum problem for $N^k \subseteq N^n$ with $k = |N^k|$ and a given index certificate $s^k$ can be parametrized in the following statement:
$$\exists N^k \subseteq N^n:\ \sum_{n_i \in N^k} n_i = s^k. \tag{12}$$
The auxiliary problem (12) excludes the accuracy parameter $p$ (the bit representation of the elements $x_i \in X^n$) from the computational complexity of problem (1) and thus facilitates the solution of problem (1). The subsets $N^k$ are determined based on the combination function (2). Each subset $N^k$ consists of $k$ elements of the set $N^n$.
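A small Python sketch of the auxiliary problem (12) is given below, assuming $N^n = \{1, 2, \ldots, n\}$. It uses a plain backtracking search that prunes branches whose smallest or largest reachable index sum excludes the target; this search strategy and the name `index_subsets` are assumptions made for illustration and are not taken from the paper. The number of subsets it yields can be large for general $n$ and $k$.

```python
def index_subsets(n, k, s_k, start=1):
    """Yield every increasing k-tuple of indexes from start..n summing to s_k.

    Illustrative backtracking search (hypothetical helper): a branch is
    abandoned when s_k lies outside the sums reachable with k indexes.
    """
    if k == 0:
        if s_k == 0:
            yield ()
        return
    lowest = k * start + k * (k - 1) // 2    # start + (start+1) + ...
    highest = k * n - k * (k - 1) // 2       # n + (n-1) + ...
    if s_k < lowest or s_k > highest:
        return
    for first in range(start, n - k + 2):
        for rest in index_subsets(n, k - 1, s_k - first, first + 1):
            yield (first,) + rest

print(list(index_subsets(8, 2, 8)))    # [(1, 7), (2, 6), (3, 5)]
```

The printed tuples are the index subsets $N^2$ that appear in Example 1 of Section 5 for the index certificate 8.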
According to the combination function (2), we have the range:
$$[Z_{min}^k,\ Z_{max}^k], \tag{13}$$
where, based on the combination function (2), we can determine $Z_{min}^k = \sum_{i=1}^{k} x_i$, $Z_{max}^k = \sum_{i=n-k+1}^{n} x_i$, $x_i \in X^n$. It is assumed that the set $X^n$ is sorted in ascending order and that elements of the subset $Z^l$ are selected such that:
$$Z_{min}^k \le z_j \le Z_{max}^k,\quad Z_{min}^k \le S \le Z_{max}^k. \tag{14}$$
Thus, the range (13) consists of the elements $z_j$ of the subset $Z^l$ ($z_j \in Z^l$) satisfying condition (14). The certificate $S$ may or may not belong to the range (13). If the certificate $S$ does not belong to the range (13), then problem (1) has no solution for that $k$.
In particular, for matrices (5) and (8) we have $Z_{min}^2 = x_1 + x_2$, $Z_{max}^2 = x_{n-1} + x_n$, and $Z_{min}^3 = x_1 + x_2 + x_3$, $Z_{max}^3 = x_{n-2} + x_{n-1} + x_n$, respectively.
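The range test (13)-(14) is straightforward to compute. The following Python sketch, with the hypothetical name `feasible_sizes`, returns the sizes $k$ for which the certificate lies inside $[Z_{min}^k, Z_{max}^k]$; it only rules sizes out and does not by itself decide whether a subset with the given sum exists.

```python
def feasible_sizes(xs, target):
    """Return the subset sizes k with Z_min^k <= target <= Z_max^k.

    Z_min^k is the sum of the k smallest elements and Z_max^k the sum of
    the k largest ones (necessary condition only; hypothetical helper).
    """
    ordered = sorted(xs)
    n = len(ordered)
    sizes = []
    z_min = z_max = 0
    for k in range(1, n + 1):
        z_min += ordered[k - 1]    # add the k-th smallest element
        z_max += ordered[n - k]    # add the k-th largest element
        if z_min <= target <= z_max:
            sizes.append(k)
    return sizes

xs = [10, 14, 17, 20, 36, 38, 43, 47]   # the set X^8 from the examples
print(feasible_sizes(xs, 53))           # [2, 3]
print(feasible_sizes(xs, 120))          # [3, 4, 5]
```

For the certificates used in Section 5, the sizes $k = 2$ (for $S = 53$) and $k = 4$ (for $S = 120$) both pass this test.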
Let us demonstrate a method for identifying the unique index certificates $s_i^k$. First, determine $s_{min}^k = \sum_{i=1}^{k} n_i$, $s_{max}^k = \sum_{i=n-k+1}^{n} n_i$, $n_i \in N^n$. Next, establish the potential range for the index certificate $s^k$, corresponding to a specific subset within the set of subsets $N^k$:
$$s_i^k \in [s_{min}^k,\ s_{max}^k], \tag{15}$$
$$s_{min}^k \le s^k \le s_{max}^k. \tag{16}$$
Notice that the range (15) describes only the unique index certificates $s_i^k$. Next, let us determine the number of unique index certificates $s_i^k$:
$$m_k = s_{max}^k - s_{min}^k + 1 = kn - \frac{(k-1)k}{2} - \frac{k(k+1)}{2} + 1 = kn - k^2 + 1. \tag{17}$$
Formula (17) defines the number of unique index certificates $s_i^k$, $i = 1, 2, \ldots, m_k$. Note that the total number of elements in the constructed matrices is equal to $C_n^k$.
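Formula (17) can be checked numerically for small $n$. The sketch below, assuming $N^n = \{1, \ldots, n\}$, enumerates all index combinations, collects their distinct sums and compares the count with $kn - k^2 + 1$; it also prints $C_n^k$, the total number of matrix elements, for comparison. The function name `unique_index_certificates` is hypothetical.

```python
from itertools import combinations
from math import comb

def unique_index_certificates(n, k):
    """Return the sorted distinct sums of k indexes chosen from 1..n."""
    return sorted({sum(c) for c in combinations(range(1, n + 1), k)})

n = 8
for k in range(1, n + 1):
    certs = unique_index_certificates(n, k)
    s_min = k * (k + 1) // 2              # 1 + 2 + ... + k
    s_max = k * n - k * (k - 1) // 2      # (n-k+1) + ... + n
    assert certs == list(range(s_min, s_max + 1))   # range (15) is filled
    assert len(certs) == k * n - k * k + 1          # formula (17)
    print(k, len(certs), comb(n, k))
```

For $n = 8$ and $k = 2$ this gives $m_2 = 13$ distinct index certificates against $C_8^2 = 28$ matrix elements.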

4. Method to solve the problem.

Based on the above discussion, we can draw an important conclusion: the quantity $m_k$ allows us to determine the time required for selecting the subsets $N^k$, $X^k$, using only the unique index certificates $s_i^k$:
$$m_k \le C_n^k. \tag{18}$$
Inequality (18) shows the high efficiency of the proposed approach in terms of algorithm execution time and required space.
Lemma. Let $s^k \in [s_{min}^k,\ s_{max}^k]$, $S \in [Z_{min}^k,\ Z_{max}^k]$, and let there be a one-to-one correspondence (11) between the subsets $Z^l$ and $S^l$. Then the time $T$ required to select the subsets $N^k$, $X^k$ and the required space $S$ satisfy the conditions:
$$T \le O(kn) \le O(n^2),\qquad S \le O\left(\frac{(n-1)n}{2}\right).$$
Proof. The fact that the certificate $S$ belongs to the range (13) ensures that inequality (14) holds and that there exists a partial sum $z_j$ such that $S = z_j$. The fact that the index certificate $s^k$ is in the range (15) means that $s^k$ satisfies inequality (16) and occurs among the unique index certificates $s_i^k$, so that $s_i^k = s^k$. These conditions allow us to find $k$ and construct the triangular two-dimensional matrices. In turn, the pair $(z_j, s_j)$ satisfies the correspondence condition (11) between the subsets $Z^l$ and $S^l$. In this case $s_j \in [s_{min}^k,\ s_{max}^k]$, as $s_j = s^k = s_i^k$. Then, based on formula (17), we have
$$T \le O(m_k) \le O(kn) \le O(n^2),$$
as $m_k = kn - k^2 + 1 \le kn \le n^2$, $1 \le k \le n$. When the triangular two-dimensional matrices are stored recursively, space $S \le O\left(\frac{(n-1)n}{2}\right)$ is sufficient.
Corollary. When matrices (5)-(10) are used directly with $k = 2$, $k = 3$, we have $T \le O(n^2)$, $S \le O\left(\frac{(n-1)n}{2}\right)$ and $T \le O(n)$, $S \le O\left(\frac{(n-1)n}{2}\right)$.

5. The subset sum problem algorithms.

With the results obtained, we can develop algorithms to solve problem (1) and the auxiliary problem (12).
Subset sum problem algorithm.
Step 1. Inputting the certificate $S$ and the sets $X^n$, $N^n$.
Step 2. Determining the size $k$ of the subsets $X^k$, $N^k$ from inequalities (14), (16) with respect to the certificate $S$ and the calculated boundaries $Z_{min}^k$, $Z_{max}^k$, $s_{min}^k$, $s_{max}^k$ of the ranges (13), (15).
Step 3. Constructing the subsets $Z^l$ and $S^l$ in the form of triangular two-dimensional matrices similar to matrices (5) and (8), as well as defining the ranges (13) and (15) from the elements of the subsets $Z^l$, $S^l$ satisfying inequalities (14) and (16), respectively.
Step 4. Checking the existence of an element $z_j = S$, $z_j \in Z^l$, and defining the partial index certificate $s_j \in S^l$ based on the correspondence (11) between the subsets $Z^l$ and $S^l$.
Step 5. Determining whether $s^k$ belongs to the range (15) and checking the condition $s_i^k = s_j = s^k$, so that $s_i^k$ belongs to one of the diagonals of the partial index certificate matrices obtained in Step 3, and finding the number of elements of this diagonal.
Step 6. Representing the unique index certificate $s_i^k$ found in Step 5 as $k$ indexes, similar to the elements of matrices (6), (9).
Step 7. Defining the subsets $X^k$, $N^k$ based on the indexes found in Step 6.
Step 8. Calculating the number $m_k$ of unique index certificates $s_i^k$ based on the boundaries $s_{min}^k$, $s_{max}^k$ in formula (17).
Step 9. Calculating the time $T$ and space $S$ required to select the subsets based on the formula $T \le O(kn) \le O(n^2)$, $S \le O\left(\frac{(n-1)n}{2}\right)$.
Step 10. Outputting $X^k$, $T$, $S$.
Auxiliary problem (12) algorithm.
Step 1. Inputting $k$, the certificate $s^k$ and the set $N^n$.
Step 2. Calculating the boundaries $s_{min}^k$, $s_{max}^k$ of the range (15).
Step 3. Constructing the subsets $S^l$ in the form of triangular two-dimensional matrices similar to matrices (7) and (10), and defining the range (15) from the elements $s_j \in S^l$ satisfying inequality (16) (which is not obligatory).
Step 4. Obtaining the partial index certificate $s_j \in S^l$, $s_j \in [s_{min}^k,\ s_{max}^k]$, from the condition $s_j = s^k$, and checking inequality (16) for the index certificate $s^k$.
Step 5. Determining whether the unique index certificate $s_i^k = s_j = s^k$ belongs to one of the diagonals of the partial index certificate matrices obtained in Step 3, and finding the number of elements of this diagonal to define the time required to select the subsets $X^k$, $N^k$.
Step 6. Representing the unique index certificate $s_i^k$ found in Step 5 as $k$ indexes, similar to the elements of matrices (6), (9).
Step 7. Defining the subsets $N^k$ based on the indexes found in Step 6.
Step 8. Calculating the number $m_k$ of unique index certificates $s_i^k$ based on the boundaries $s_{min}^k$, $s_{max}^k$ in formula (17).
Step 9. Calculating the time $T$ and space $S$ required to select the subsets $N^k$ based on the formula $T \le O(kn) \le O(n^2)$, $S \le O\left(\frac{(n-1)n}{2}\right)$.
Step 10. Outputting $N^k$, $T$, $S$.
Examples are provided to confirm the claimed results and to show how the algorithms can be applied in a compressed form:
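Given the set $X^8 = \{10, 14, 17, 20, 36, 38, 43, 47\}$.
Example 1. Let $S = 53$. Matrix (5) is constructed as follows:
$$
\begin{array}{rrrrrrr}
24 & 27 & 30 & 46 & 48 & 53 & 57 \\
31 & 34 & 50 & 52 & 57 & 61 & \\
37 & 53 & 55 & 60 & 64 & & \\
56 & 58 & 63 & 67 & & & \\
74 & 79 & 83 & & & & \\
81 & 85 & & & & & \\
90 & & & & & &
\end{array}
\tag{19}
$$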
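Matrix (6) is constructed as follows:
$$
\begin{array}{rrrrrrr}
12 & 13 & 14 & 15 & 16 & 17 & 18 \\
23 & 24 & 25 & 26 & 27 & 28 & \\
34 & 35 & 36 & 37 & 38 & & \\
45 & 46 & 47 & 48 & & & \\
56 & 57 & 58 & & & & \\
67 & 68 & & & & & \\
78 & & & & & &
\end{array}
\tag{20}
$$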
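Matrix (7) is constructed as follows:
$$
\begin{array}{rrrrrrr}
3 & 4 & 5 & 6 & 7 & 8 & 9 \\
5 & 6 & 7 & 8 & 9 & 10 & \\
7 & 8 & 9 & 10 & 11 & & \\
9 & 10 & 11 & 12 & & & \\
11 & 12 & 13 & & & & \\
13 & 14 & & & & & \\
15 & & & & & &
\end{array}
\tag{21}
$$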
From the correspondence condition (11) between $S$ and $s^k$, for $S = 53$ we have $s^2 = 8$ as per matrices (19) and (21). Let us decompose $s^2$ into indexes and find the subsets $N^2 = \{1, 7\}, \{2, 6\}, \{3, 5\}$. The possible subsets are then $X^2 = \{x_1, x_7\}, \{x_2, x_6\}, \{x_3, x_5\}$. The subsets satisfying the given certificate $S$ are $X^2 = \{x_1, x_7\}, \{x_3, x_5\}$, with $T \le O(13) \le O(16) \le O(64)$, $S \le O(28)$.
Example 2. Let $S = 120$, $s^4 = 20$. Subtract: $120 - 53 = 67$. For the certificate $S = 67$ there is $s^2 = 12$ as per matrices (19) and (21). The subsets are $N^2 = \{4, 8\}, \{5, 7\}$, so $X^2 = \{x_4, x_8\}, \{x_5, x_7\}$. Combining this with Example 1, we obtain $X^4 = \{10, 43, 20, 47\} = \{x_1, x_7, x_4, x_8\}$ and $X^4 = \{17, 36, 20, 47\} = \{x_3, x_5, x_4, x_8\}$, with $T \le O(13) \le O(16) \le O(64)$, $S \le O(28)$.
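The bookkeeping of Examples 1 and 2 can be reproduced with a short Python sketch. It groups index combinations by their index certificate (the diagonals of the index certificate matrices) and reports, for each certificate, the candidate index subsets and those whose elements actually sum to $S$. The function name `solutions_by_certificate` and the returned structure are hypothetical. The sketch enumerates all $C_n^k$ index combinations, so it illustrates the correspondence between sums and index certificates rather than the running time stated in the lemma.

```python
from itertools import combinations

def solutions_by_certificate(xs, k, target):
    """Map each index certificate to (candidates, solutions).

    candidates: all k-index combinations with that index sum (one diagonal);
    solutions:  the candidates whose elements of xs sum to target.
    Only certificates with at least one solution are returned.
    Illustrative helper that enumerates every combination.
    """
    n = len(xs)
    diagonals = {}
    for idx in combinations(range(1, n + 1), k):
        diagonals.setdefault(sum(idx), []).append(idx)
    found = {}
    for s_k, candidates in diagonals.items():
        hits = [idx for idx in candidates
                if sum(xs[i - 1] for i in idx) == target]
        if hits:
            found[s_k] = (candidates, hits)
    return found

xs = [10, 14, 17, 20, 36, 38, 43, 47]
print(solutions_by_certificate(xs, 2, 53))
# {8: ([(1, 7), (2, 6), (3, 5)], [(1, 7), (3, 5)])}
print(solutions_by_certificate(xs, 4, 120)[20][1])
# [(1, 4, 7, 8), (3, 4, 5, 8)]
```

The first call returns the three candidate subsets and two satisfying subsets of Example 1; the second returns the two subsets $X^4$ of Example 2, both with index certificate 20.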

6. Conclusions and future work.

The article provides a lemma estimating the required time $T$ and space $S$, together with algorithms for solving the subset sum problem. On the basis of the results obtained, a framework for managing the indexes of the initial set has been developed. The proposed algorithms can be easily implemented as software and/or hardware solutions in a variety of applications, including scheduling [13], database queries [14], graph problems [15] and others.
Since the linear (or quadratic) solvability of the subset sum problem, which belongs to the NP-complete class, has been proved, the equality of the classes P and NP is claimed on the basis of the well-known theorem stating that if some NP-complete problem is solvable in polynomial time, then P = NP.
Further work will focus on:
- partitioning an initial set into subsets (using Vandermonde’s convolution and the symmetry property);
- applying properties of the combination function;
- optimizing combination algorithms;
- parallelizing the computation, etc.

References

  1. E. Horowitz, S. Sahni. Computing Partitions with Applications to the Knapsack Problem // Journal of the ACM (JACM), 1974, Vol. 21, pp. 277–292.
  2. R. Schroeppel, A. Shamir. A $T = O(2^{n/2})$, $S = O(2^{n/4})$ Algorithm for Certain NP-Complete Problems // SIAM Journal on Computing, 1981, Vol. 10, No. 3, pp. 456–464.
  3. Richard Bellman. Notes on the theory of dynamic programming IV: maximization over discrete sets // Naval Research Logistics Quarterly, 3(1-2):67–70, 1956.
  4. David Pisinger. Linear time algorithms for knapsack problems with bounded weights // Journal of Algorithms, 33(1):1–14, 1999.
  5. Konstantinos Koiliaris, Chao Xu. A faster pseudopolynomial time algorithm for subset sum. To appear in SODA ’17, 2017 // arXiv:1610.04712v2 [cs.DS], 8 Jan 2017, 18 p.
  6. Karl Bringmann. A near-linear pseudopolynomial time algorithm for subset sum. To appear in SODA ’17, 2017 // arXiv:1610.04712v2 [cs.DS], 8 Jan 2017, 18 p.
  7. A. Lincoln, V. V. Williams, J. R. Wang, R. R. Williams. Deterministic Time-Space Tradeoffs for k-SUM // arXiv preprint arXiv:1605.07285.
  8. N. Wirth. Algorithms and Data Structures. Russian translation. Moscow: Mir, 2006.
  9. R. M. Karp. Reducibility among combinatorial problems. Springer, 1972.
  10. A. Cobham. The intrinsic computational difficulty of functions // In Proceedings of the Congress for Logic, Methodology and Philosophy of Science. North-Holland, 1964, pp. 24–30.
  11. J. Edmonds. Paths, trees, and flowers // Canadian Journal of Mathematics, 1965, Vol. 17, pp. 449–467.
  12. R. M. Kolpakov, M. A. Posypkin. An upper bound for the number of branches for the subset sum problem // Math. Issues of Cybernetics, Issue 18. Moscow: Fizmatlit, 2013, pp. 213–226.
  13. Xiangtong Qi. Coordinated logistics scheduling for in-house production and outsourcing // IEEE Transactions on Automation Science and Engineering, 5(1):188–192, Jan 2008.
  14. Quoc Trung Tran, Chee-Yong Chan, Guoping Wang. Evaluation of set-based queries with aggregation constraints // In Proceedings of the 20th ACM Conference on Information and Knowledge Management, CIKM 2011, Glasgow, United Kingdom, October 24-28, 2011, pp. 1495–1504.
  15. Venkatesan Guruswami, Yury Makarychev, Prasad Raghavendra, David Steurer, Yuan Zhou. Finding almost-perfect graph bisections // In Innovations in Computer Science - ICS 2010, Tsinghua University, Beijing, China, January 7-9, 2011. Proceedings, pp. 321–337.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.