1. Introduction
Manufacturing adds value to an economy. One of the acclaimed manufacturing concepts is smart manufacturing [1,2] or Industry 4.0 [3,4]. Smart manufacturing employs information and communication technology-based systems to solve manufacturing problems. Among others, cyber-physical systems, the Industrial Internet of Things, big data, big data analytics, artificial intelligence, machine learning, digital models, digital shadows, digital twins, sensor signaling, virtual reality, and digital manufacturing commons are the main constituents of smart manufacturing [1,2,3,4,5]. These constituents are embedded into manufacturing enablers (machine tools, human resources, peripheral equipment, computer-aided design, manufacturing and process planning systems, enterprise resource planning systems, and supply chain systems) [1,2,3,4,5]. The goal is to make the manufacturing enablers more self-reliant or autonomous [2,4,5]. Consequently, the above-mentioned constituents and enablers need to perform some cognitive tasks (pattern recognition, knowledge elicitation, adaptation to emerging environments, and dealing with uncertainty) [2,4,5]. In this case, different aspects of biological systems can be used as a source of inspiration, as suggested by the “biologicalization” movement of manufacturing [6,7]. (This article uses the term “biologicalization” instead of “biologicalisation” to keep spelling consistency.) In biologicalization, three stages are considered: 1) bio-inspiration, 2) bio-integration, and 3) bio-intelligence [6,7]. Bio-inspiration refers to a method that is modeled on a biological phenomenon. Among others, genetic algorithms/programming and artificial neural networks/deep learning are the prominent bio-inspired computing methods. Genetic algorithms are based on the biological phenomenon called evolution, whereas artificial neural networks/deep learning are based on the neuron signal processing that takes place in the brain of a biological organism. Bio-integration integrates biotechnological solutions into manufacturing, e.g., cleaning wastewater and other chemical wastes using microorganisms (bacteria). Lastly, bio-intelligence injects human-like intelligence into smart manufacturing systems, making them even smarter. For example, a bio-intelligence-based design system can design a product by translating customer needs into product specifications. It is worth mentioning that, among the three stages, bio-intelligence is the most challenging because it needs knowledge creation as well as simultaneous management of all four types of knowledge [8] rather than mere use of existing knowledge.
Biologicalization in manufacturing can be traced back to the initiative of Ueda called Biological Manufacturing Systems (BMS) [9,10,11]. The BMS acknowledges the ever-changing external and internal environments of the life cycle of a product and lets the enabling systems self-grow, self-organize, adapt, and evolve [9,10,11]. Like other systems, the BMS needs information to run. Two types of information are recommended: 1) “DNA-type information” and 2) “BN (Brain and Neurons)-type information” [9,10,11]. In biological systems, DNA-type information means inherited information (i.e., genetic information (DNA) that is passed from one cell to another and from one generation to another), and BN-type information means learned information (i.e., the information that the brain learns using the neural network). For the BMS, on the other hand, DNA-type information is a metaphor. It refers to information the BMS must inherit while solving product life cycle-relevant problems. Similarly, BN-type information is also a metaphor for the BMS. It means the information that the BMS must learn while solving product life cycle-relevant problems. One of the main concerns of the BMS is which of the two, DNA-type information or BN-type information, should be prioritized. The answer is that DNA-type information is more important than BN-type information because superior biological functions, such as flexibility, autonomy, self-formation, and self-recovery, are expressed by DNA-type information rather than BN-type information. See [9,10,11] for more details. Notably, an integrated approach called “gentelligent” manufacturing systems has been developed to benefit from both DNA-type information and BN-type information [12,13].
However, some authors have addressed the biologicalization movement of manufacturing [14,15,16,17,18,19,20,21] based on the central dogma of molecular biology [22,23] introduced by Crick (one of the Nobel laureates who discovered the double helix structure of DNA). Crick stated, “Once information has got into a protein, it can’t get out again. Information here means the sequence of the amino acid residues, or other sequences related to it” [23,24]. This means that genetic information can be passed to protein (a sequence of amino acids) from DNA or RNA (a sequence of nucleic acids), but the reverse is not possible. Therefore, biological systems only allow “DNA to DNA,” “DNA to RNA,” and “DNA to protein” information flow. Since a DNA molecule is a sequence of four elements (A, C, G, and T) and a protein is a sequence of twenty types of amino acids, the central dogma of molecular biology implies that biological systems promote a jump in information content [25] while synthesizing protein using genetic information written in DNA (the maximum possible information content per symbol is $\log_2(4) = 2$ bits for DNA and $\log_2(20) \approx 4.322$ bits for protein). The authors consider that the information processing underlying DNA-RNA-Protein, and subsequently the jump in information content, can be of great use while solving smart manufacturing-relevant problems (pattern recognition and similarity identification). In this article, the authors use some arbitrary examples and present the operations to be used in DNA-RNA-Protein-like information processing for solving the abovementioned cognitive problems.
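The two information-content figures quoted above follow from taking the base-2 logarithm of the alphabet size, which is the per-symbol maximum when all symbols are equally likely. The short Python sketch below reproduces the arithmetic; the function name `max_information_content` is a hypothetical helper introduced only for illustration.

```python
import math


def max_information_content(alphabet_size: int) -> float:
    """Maximum information per symbol, in bits, for equally likely symbols."""
    return math.log2(alphabet_size)


dna_bits = max_information_content(4)       # four bases: A, C, G, T
protein_bits = max_information_content(20)  # twenty types of amino acids

print(f"DNA:     {dna_bits:.3f} bits per symbol")
print(f"Protein: {protein_bits:.3f} bits per symbol")
print(f"Jump:    {protein_bits - dna_bits:.3f} bits per symbol")
```

The difference between the two values is the per-symbol "jump" in information content that the text refers to.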
The rest of this article is organized as follows.
Section 2 uses a schematic diagram to describe the central dogma of molecular biology and the main processes involved.
Section 3 presents algorithms underlying DNA-Based Computing (DBC) inspired by the central dogma of molecular biology.
Section 4 presents three arbitrary examples showing the cognitive computing ability of DBC. In particular, the following three types of problems are considered: 1) similarity indexing between seemingly different but inherently identical objects; 2) recognizing regions of an image separated by a complex boundary; and 3) recognizing patterns using insufficient information (e.g., a very short-windowed sensor signal).
Section 5 concludes this article.
2. Central Dogma of Molecular Biology
As mentioned before, the central dogma of molecular biology establishes the logical and physical relationships among such macromolecules as DNA, RNA, and proteins. In particular, biological systems only allow “DNA to DNA,” “DNA to RNA,” and “DNA to protein” information flow [22,23]. A comprehensive description of DNA-RNA-Protein-centric processes can be found in [24]. In this article, the objective is to gain inspiration from the core processes of the central dogma and build models (algorithms) to solve cognitive problems. Therefore, a customized and brief description of the core processes underlying the central dogma of molecular biology is presented in this section.
Figure 1 schematically illustrates the core processes of the central dogma of molecular biology [16,17]. As seen in Figure 1, the three core processes are 1) DNA replication, 2) DNA Transcription, and 3) RNA Translation. The first one produces a copy of DNA when cell division occurs [24]. The other two are involved in protein synthesis [24]. The enzymatic activities of the latter two processes are denoted as ENZm and ENZt, respectively. The molecules involved are DNA (double strands of nucleic acids (A, C, G, and T)), mRNA (single strand of nucleic acids (A, C, G, and U)), tRNA (a relatively short strand of nucleic acids), codon (three consecutive nucleic acids of mRNA), anti-codon (counterpart of codon in tRNA), free bases (free nucleic acids), tRNA charged with amino acid, and protein (sequence of amino acids). Thus, the main objective of these molecules and enzymatic activities is to synthesize a protein based on the genetic information stored in DNA. Consequently, the central dogma’s main purpose is to maintain a unidirectional information flow (from DNA to protein via RNA, but not the other way around).
The enzymatic activities of DNA Transcription (ENZm) first recognize a DNA segment as a gene and make a copy of it (the gene) by assembling free bases. The copied segment is used to form a messenger RNA (mRNA). While doing so, A is copied as U, C is copied as G, G is copied as C, and T is copied as A. On the other hand, the enzymatic activities of RNA Translation (ENZt) operate on mRNA, tRNA, and free amino acids and produce a protein (sequence of amino acids). In doing so, ENZt first bonds an appropriate amino acid to a tRNA. The selection of the amino acid depends on the anti-codon (a three-base sequence of nucleic acids at a certain location of the tRNA). In the next step, ENZt helps make a bond between a codon of the mRNA and the anti-codon of the tRNA-amino-acid compound. Finally, ENZt helps the tRNA-amino-acid compound release the amino acid to the growing chain of the protein. This way, the genetic instruction stored in DNA is passed to proteins. It is worth mentioning that the sequence of amino acids of a protein is its primary structure. It forms a three-dimensional structure due to folding (tertiary structure). The tertiary structures of proteins perform functional activities and serve as structural units in biological systems. Therefore, biological systems keep producing proteins. Besides DNA Transcription and RNA Translation, there is another important process, called DNA replication, by which the DNA is replicated for storage and cell division. Some microorganisms rely on mRNA directly for protein synthesis.
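The copying rule used when the mRNA is formed (A to U, C to G, G to C, and T to A) and the grouping of the mRNA into codons can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text; the names `TRANSCRIPTION_RULE`, `transcribe`, and `split_into_codons`, as well as the sample segment, are assumptions introduced here.

```python
# Copying rule stated in the text: A -> U, C -> G, G -> C, T -> A.
TRANSCRIPTION_RULE = {"A": "U", "C": "G", "G": "C", "T": "A"}


def transcribe(dna_segment: str) -> str:
    """Return the mRNA-like string obtained by copying each DNA base."""
    try:
        return "".join(TRANSCRIPTION_RULE[base] for base in dna_segment)
    except KeyError as exc:
        raise ValueError(f"not a DNA base: {exc.args[0]!r}") from None


def split_into_codons(mrna: str) -> list[str]:
    """Group the mRNA into consecutive three-base codons.

    An incomplete group at the end (one or two bases) is dropped.
    """
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


segment = "TACGGA"  # hypothetical DNA segment, chosen only for illustration
mrna = transcribe(segment)
print(mrna)                      # AUGCCU
print(split_into_codons(mrna))   # ['AUG', 'CCU']
```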
There are some universal rules that biological systems use during DNA Transcription and RNA Translation. These are called genetic rules. The genetic rules establish relationships among amino acids, codons of mRNA, and anti-codons of tRNA. Table 1 shows the nucleic acid relations of DNA, mRNA, and tRNA.
3. DNA-Based Computing (DBC)
Computing methods can be developed inspired by the core processes involved in the central dogma of molecular biology. The authors denote these computing methods as DNA-Based Computing (DBC) [16,17,18,19,20,21]. DBC can take many forms, but the basic principle remains the same: generate a many-element-based piece of information (similar to protein) from a few-element-based piece of information (similar to DNA and RNA). Consequently, a rise in the information content [25] may take place (see Section 1). The two most promising forms (algorithms) of DBC, denoted as DBC-1 and DBC-2, are described below.
First, consider DBC-1. It is schematically illustrated in Figure 2. This algorithm consists of four steps and seven processes. Referring to Figure 2, the steps and processes are described as follows.
Step 1: This step consists of Process 1. Process 1 extracts problem-relevant information from a given problem.
Step 2: This step consists of Processes 2 and 3 and generates DNA-like information (a DNA array) from the problem-relevant information. In doing so, the user sets the DNA-forming rules. For example, consider that the binary string <0000111001110000011111111> is the problem-relevant information. If the user defines 00 → A, 01 → C, 10 → G, and 11 → T, which are denoted as DNA translation rules, then the resulting DNA array is <AATGCTAACTTT>. Note that the last digit (“1”) is truncated because the string has an odd number of digits. Instead of truncating a digit, one more digit can be added (in this case, “1” or “0,” depending on a user-defined process). If “1” is added, the resulting DNA array becomes one letter longer, i.e., <AATGCTAACTTTT>. The reading frame is also a constituent of the DNA-forming rules. The above case refers to the pair-wise reading frame, i.e., two consecutive digits in the binary string are replaced by one letter of DNA. There are other possibilities. Each digit in the binary string can be replaced by a letter of DNA, considering it (the digit) and the next one. This reading frame is denoted as the continuous reading frame. Thus, the continuous reading frame, with a “1” added after the last digit, converts <0000111001110000011111111> to <AAACTTGACTTGAAAACTTTTTTTT>. Therefore, the constituents of the DNA-forming rules are DNA translation rules, truncation/adding schemes, and reading frames.
Step 3: This step consists of Processes 4 and 5 and produces a protein array that consists of single-letter symbols of amino acids. In doing so, the protein-forming rules use the protein translation rules shown in Table 2 (i.e., three consecutive letters of the DNA array are considered a codon and translated into the corresponding single-letter symbol of an amino acid). This step also employs a truncation/addition scheme and a reading frame similar to the ones described in the previous step. For example, if the DNA array <AATGCTAACTTT> undergoes the protein-forming rules with the continuous reading frame, the following protein array is produced: <NMCALXNTLF>. In this case, the protein-forming rule is DNA(i)DNA(i+1)DNA(i+2) = codon(i) → AA(i), where DNA(i) is the i-th letter in the DNA array, codon(i) is the i-th codon (one of the three-letter DNA base sequences shown in Table 2), and AA(i) is the corresponding single-letter symbol of the amino acid (according to Table 2), written as protein(i) in the rules below. Thus, ∀codon(i) ∈ {AAA, AAC, AAG, AAT, ACA, ACC, ACG, ACT, AGA, AGC, AGG, AGT, ATA, ATC, ATG, ATT, CAA, CAC, CAG, CAT, CCA, CCC, CCG, CCT, CGA, CGC, CGG, CGT, CTA, CTC, CTG, CTT, GAA, GAC, GAG, GAT, GCA, GCC, GCG, GCT, GGA, GGC, GGG, GGT, GTA, GTC, GTG, GTT, TAA, TAC, TAG, TAT, TCA, TCC, TCG, TCT, TGA, TGC, TGG, TGT, TTA, TTC, TTG, TTT}. Moreover, the following protein-forming rules hold:
IF codon(i) ∈ {ATT, ATC, ATA} THEN protein(i) = I
IF codon(i) ∈ {CTT, CTC, CTA, CTG, TTA, TTG} THEN protein(i) = L
IF codon(i) ∈ {GTT, GTC, GTA, GTG} THEN protein(i) = V
IF codon(i) ∈ {TTT, TTC} THEN protein(i) = F
IF codon(i) ∈ {ATG} THEN protein(i) = M
IF codon(i) ∈ {TGT, TGC} THEN protein(i) = C
IF codon(i) ∈ {GCT, GCC, GCA, GCG} THEN protein(i) = A
IF codon(i) ∈ {GGT, GGC, GGA, GGG} THEN protein(i) = G
IF codon(i) ∈ {CCT, CCC, CCA, CCG} THEN protein(i) = P
IF codon(i) ∈ {ACT, ACC, ACA, ACG} THEN protein(i) = T
IF codon(i) ∈ {TCT, TCC, TCA, TCG, AGT, AGC} THEN protein(i) = S
IF codon(i) ∈ {TAT, TAC} THEN protein(i) = Y
IF codon(i) ∈ {TGG} THEN protein(i) = W
IF codon(i) ∈ {CAA, CAG} THEN protein(i) = Q
IF codon(i) ∈ {AAT, AAC} THEN protein(i) = N
IF codon(i) ∈ {CAT, CAC} THEN protein(i) = H
IF codon(i) ∈ {GAA, GAG} THEN protein(i) = E
IF codon(i) ∈ {GAT, GAC} THEN protein(i) = D
IF codon(i) ∈ {AAA, AAG} THEN protein(i) = K
IF codon(i) ∈ {CGT, CGC, CGA, CGG, AGA, AGG} THEN protein(i) = R
IF codon(i) ∈ {TAA, TAG, TGA} THEN protein(i) = X
Step 4: This is the last step. It consists of Processes 6 and 7 and generates a solution to solve the problem. In this case, the information of the protein array (both sequential information and frequency-driven information) is used to extract problem-solving rules. Here, sequential information means whether a particular amino acid sequence occurs. Frequency-driven information means which amino acids are predominant, which are not, and which are absent in the protein array. See Section 4 for examples of problem-solving rules.
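Steps 2 and 3 of DBC-1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `form_dna`, `form_protein`, `DNA_RULES`, and `PROTEIN_RULES` are hypothetical, the `pad` argument is one possible way to express the truncation/addition scheme, and the protein array is formed with an overlapping (continuous) reading frame by default, since that frame maps the 12-letter DNA array <AATGCTAACTTT> to <NMCALXNTLF>.

```python
# Genetic rules as listed in the text; "X" marks the three stop codons.
PROTEIN_RULES = {
    "I": "ATT ATC ATA",
    "L": "CTT CTC CTA CTG TTA TTG",
    "V": "GTT GTC GTA GTG",
    "F": "TTT TTC",
    "M": "ATG",
    "C": "TGT TGC",
    "A": "GCT GCC GCA GCG",
    "G": "GGT GGC GGA GGG",
    "P": "CCT CCC CCA CCG",
    "T": "ACT ACC ACA ACG",
    "S": "TCT TCC TCA TCG AGT AGC",
    "Y": "TAT TAC",
    "W": "TGG",
    "Q": "CAA CAG",
    "N": "AAT AAC",
    "H": "CAT CAC",
    "E": "GAA GAG",
    "D": "GAT GAC",
    "K": "AAA AAG",
    "R": "CGT CGC CGA CGG AGA AGG",
    "X": "TAA TAG TGA",
}
CODON_TO_AA = {
    codon: aa for aa, group in PROTEIN_RULES.items() for codon in group.split()
}

# DNA translation rules from the worked example in Step 2.
DNA_RULES = {"00": "A", "01": "C", "10": "G", "11": "T"}


def form_dna(bits, rules=DNA_RULES, frame="pairwise", pad=None):
    """Step 2: convert a binary string into a DNA array.

    frame="pairwise":   non-overlapping pairs of digits give one letter each.
    frame="continuous": every digit is read together with the next digit.
    pad=None truncates a digit that has no partner; pad="0" or pad="1"
    appends that digit instead (assumed form of the addition scheme).
    """
    if frame not in ("pairwise", "continuous"):
        raise ValueError(f"unknown reading frame: {frame!r}")
    if pad is not None and (frame == "continuous" or len(bits) % 2):
        bits = bits + pad
    step = 2 if frame == "pairwise" else 1
    return "".join(rules[bits[i:i + 2]] for i in range(0, len(bits) - 1, step))


def form_protein(dna, frame="continuous"):
    """Step 3: convert a DNA array into a protein array via the codon rules."""
    if frame not in ("pairwise", "continuous"):
        raise ValueError(f"unknown reading frame: {frame!r}")
    step = 1 if frame == "continuous" else 3  # "pairwise" here: disjoint codons
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna) - 2, step))


bits = "0000111001110000011111111"
dna = form_dna(bits)                                 # pair-wise, truncation
print(dna)                                           # AATGCTAACTTT
print(form_dna(bits, pad="1"))                       # AATGCTAACTTTT
print(form_dna(bits, frame="continuous", pad="1"))   # AAACTTGACTTGAAAACTTTTTTTT
print(form_protein(dna))                             # NMCALXNTLF
```

Keeping the translation rules in plain dictionaries reflects the point that the DNA-forming rules are user-defined: a different rule set or reading frame can be passed in without touching the functions.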
The other form of DBC, DBC-2, is schematically illustrated in Figure 3. Compared to DBC-1, DBC-2 has an additional functionality to deal with mRNA-forming rules and an mRNA array. In addition, it produces multiple DNA arrays, unlike DBC-1. Consequently, DBC-2 consists of five steps and nine processes. The description is as follows.
Step 1: This step consists of Process 1 and produces problem-relevant information from a given problem. Thus, this step is the same as Step 1 of DBC-1.
Step 2: This step consists of Processes 2 and 3 and generates multiple DNA arrays using different DNA-forming rules. Recall the arbitrary example of the DNA array of DBC-1, where <0000111001110000011111111> is converted to <AATGCTAACTTT> using the following DNA-forming rules: DNA translation rules: 00 → A, 01 → C, 10 → G, and 11 → T; truncation/addition scheme: truncation; and reading frame: pair-wise. If different DNA translation rules are used, i.e., 11 → A, 01 → C, 10 → G, and 00 → T, keeping the truncation/addition scheme and reading frame the same, then the following DNA array results: <TTAGCATTCAAA>. Similarly, if yet other DNA translation rules are used, i.e., 11 → A, 10 → C, 01 → G, and 00 → T, keeping the truncation/addition scheme and reading frame the same, then the following DNA array results: <TTACGATTGAAA>. This way, multiple DNA arrays can be produced from a single piece of problem-relevant information.
Step 3: This step consists of Processes 4 and 5 and generates an mRNA array using mRNA-forming rules. The mRNA-forming rules determine how to integrate the DNA arrays produced in the previous step while generating an mRNA array. For example, consider the following formulation. Let the DNA arrays be DNA1 = <AATGCTAACTTT>, DNA2 = <TTAGCATTCAAA>, and DNA3 = <TTACGATTGAAA>. An mRNA-forming rule is: mRNA = <DNA1DNA2DNA3>, i.e., mRNA = <AATGCTAACTTTTTAGCATTCAAATTACGATTGAAA>. This type of mRNA-forming rule is denoted as the direct addition rule, and the resulting mRNA is denoted as directly added mRNA. There are other alternatives, too. For example, consider the case of an mRNA array denoted as a cascaded mRNA, where a cascading rule is applied to integrate the elements of the DNA arrays, as follows: mRNA = <…mRNA(j)mRNA(j+1)mRNA(j+2)…> = <…DNA1(i)DNA2(i)DNA3(i)…> = <ATTATTTAAGGCCCGTAAATTATTCCGTAATAATAA>.
Step 4: This step consists of Processes 6 and 7 and produces a protein array that consists of single-letter symbols of amino acids. This step is similar to Step 3 of DBC-1. The only difference is that it operates on the mRNA array, not on the DNA array(s).
Step 5: This is the last step. It consists of Processes 8 and 9 and generates a solution to solve the problem. It is similar to Step 4 of DBC-1.
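Steps 2 and 3 of DBC-2 (multiple DNA arrays followed by mRNA formation) can be reproduced with the short Python sketch below. The function names `form_dna_pairwise`, `direct_addition`, and `cascade` are hypothetical labels for the rules described above; the sketch assumes the pair-wise reading frame with truncation and, for the cascading rule, DNA arrays of equal length.

```python
BITS = "0000111001110000011111111"

# The three sets of DNA translation rules used in the worked example.
RULE_SETS = [
    {"00": "A", "01": "C", "10": "G", "11": "T"},
    {"11": "A", "01": "C", "10": "G", "00": "T"},
    {"11": "A", "10": "C", "01": "G", "00": "T"},
]


def form_dna_pairwise(bits, rules):
    """Pair-wise reading frame; a trailing digit without a partner is truncated."""
    return "".join(rules[bits[i:i + 2]] for i in range(0, len(bits) - 1, 2))


def direct_addition(dna_arrays):
    """Direct addition rule: concatenate the DNA arrays one after another."""
    return "".join(dna_arrays)


def cascade(dna_arrays):
    """Cascading rule: take the i-th letter of every DNA array in turn."""
    if len({len(dna) for dna in dna_arrays}) != 1:
        raise ValueError("cascading assumes DNA arrays of equal length")
    return "".join("".join(column) for column in zip(*dna_arrays))


dna_arrays = [form_dna_pairwise(BITS, rules) for rules in RULE_SETS]
print(dna_arrays)
# ['AATGCTAACTTT', 'TTAGCATTCAAA', 'TTACGATTGAAA']
print(direct_addition(dna_arrays))
# AATGCTAACTTTTTAGCATTCAAATTACGATTGAAA
print(cascade(dna_arrays))
# ATTATTTAAGGCCCGTAAATTATTCCGTAATAATAA
```

The resulting mRNA array can then be passed to the same codon-to-amino-acid translation used in Step 3 of DBC-1.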
Other than DBC-1 and DBC-2, other forms of DBC can be developed based on the basic principle: get a many-element piece of information (similar to protein) from a few-element piece of information (similar to DNA). Notably, a user has much freedom to customize DBC for a given problem. This freedom is exercised by fixing the DNA-forming, mRNA-forming, protein-forming, and problem-solving rules. Protein-forming rules are fixed unless the user wants to replace the genetic rules (Table 2) with other rules that obey the abovementioned basic principle. Regarding problem-solving rules, DBC must not act like a black-box type machine learning algorithm (e.g., artificial neural network). The focus is on engaging human users with the problems they want to solve. The case studies presented in the next section shed light on the performance and characteristics of DBC described above.
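The two kinds of protein-array information used in the last step of DBC-1 and DBC-2 (sequential and frequency-driven) can be inspected directly, which fits the intent that problem-solving rules remain readable by a human user. The Python sketch below is one possible way to extract them; the function names and the sample sequence "CAL" are assumptions made only for illustration, and the actual problem-solving rules are problem-specific.

```python
from collections import Counter

# Single-letter symbols: twenty amino acids plus "X" for the stop codons.
SYMBOLS = "ACDEFGHIKLMNPQRSTVWYX"


def frequency_information(protein):
    """Count how often each symbol occurs in the protein array."""
    counts = Counter(protein)
    return {symbol: counts.get(symbol, 0) for symbol in SYMBOLS}


def predominant_symbols(protein):
    """Symbols that occur most often (ties are all returned, sorted)."""
    counts = Counter(protein)
    top = max(counts.values())
    return sorted(symbol for symbol, n in counts.items() if n == top)


def absent_symbols(protein):
    """Symbols that never occur in the protein array."""
    return sorted(set(SYMBOLS) - set(protein))


def contains_sequence(protein, sequence):
    """Sequential information: does a particular sequence occur?"""
    return sequence in protein


protein = "NMCALXNTLF"  # protein array from the DBC-1 worked example
print(predominant_symbols(protein))        # ['L', 'N']
print(absent_symbols(protein))
print(contains_sequence(protein, "CAL"))   # True
```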