@DoIMustHaveAnUsername?
Thanks for the excellent posts. They were an enjoyable read.
1. I agree that sentience is a matter of degree rather than a binary state.
2. When I said I used GPT 3 and 4, I obviously meant GPT 2 and 3. GPT 4 doesn't exist yet, as far as I know.
3. Why, in my opinion, GPT and similar content-trained neural networks won't become AGI:
3.1 They are limited to producing permutations of the input they are trained on. At best they could mimic the behavior of living things, but they lack the systems responsible for independent action or for dealing with novelty, because they don't start out as living, learning organisms, only as pattern-repeating mechanisms.
3.2 They don't have a direction, goal or will. Rather, they choose their actions based on the current input plus their network biases, and those biases represent the most common elements of their training data. Humans do some things in almost the same way as other humans, but they also do some things in highly individual ways that deviate greatly from the average distribution of behavior. NNs are great at building the distribution and staying near the average, but there will never be enough training data to teach them how to be an outlier at something.
3.3 They don't really have an internal feedback loop running separately from their perception. I think a successful sentient AI needs perception of external states kept in relative separation from its internal states.
3.4 Their memory is part of the same neural network responsible for I/O. At best, a specialized language NN could become part of a set of NNs directed by a central NN that decides which domain-specific NN to use for the domain it's engaging with. Much of its memory is just its own history of outputs fed back to it as the current input, which quickly exceeds its capacity as the prompts build up.
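To make the "memory is just the prompt history" point in 3.4 concrete, here is a toy sketch. Everything in it is made up for illustration -- generate_reply is a placeholder rather than any real API, and a word count stands in for a real tokenizer:

```python
# Toy illustration of 3.4: the model's only "memory" is its own prompt history,
# and once that history exceeds the context budget, the oldest turns are simply dropped.

CONTEXT_BUDGET = 60  # pretend the model can only "see" 60 tokens at once

def count_tokens(text: str) -> int:
    # crude stand-in for a real tokenizer: one word = one token
    return len(text.split())

def generate_reply(prompt: str) -> str:
    # placeholder for a language-model call; it just reports how much context it saw
    return f"(reply conditioned on {count_tokens(prompt)} tokens of context)"

def chat(user_turns):
    history = []
    for turn in user_turns:
        history.append(f"User: {turn}")
        # drop the oldest turns until everything fits into the budget again
        while count_tokens("\n".join(history)) > CONTEXT_BUDGET:
            history.pop(0)  # earlier conversation is forgotten for good
        reply = generate_reply("\n".join(history))
        history.append(f"Bot: {reply}")
        print(reply)

chat(["hello there", "tell me a long story " * 8, "what did I say first?"])
```

By the last turn the opening message has already been pushed out of the budget, which is exactly the limitation I mean.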
3.5 The memory and processing-time requirements for sufficiently large NNs are already extraordinarily big. It's safe to say that GPT-3 can be a decent copywriter. To make it an excellent writer it would need a better sense of what to write, but aside from that it would probably need an order of magnitude more training, which would cost on the order of $100 million in compute time. That isn't impossibly large, but once you consider all the possible domains and cross-domain capabilities, the training cost becomes astronomical, not to mention operating a live system that makes real-time decisions and keeps several exceedingly large NNs running together.
3.6 GPT can parametrize, if that's what you mean by multi-modal. It can spot elements of text or numbers that it can then replace with other values, like changing names or variables in code, or shifting fragments of code around to fit an answer. That is quite impressive for a dumb program, but unless it can test the output code to see whether it works, feed the results back to itself, and add code until it passes the tests, it can't be considered anything more than editing code or text rather than writing it.
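The test-and-feedback loop I have in mind in 3.6 would look roughly like this toy sketch. generate_candidate is a stand-in for a code-writing model with hard-coded "attempts"; it only shows the shape of the loop, not anything GPT actually does:

```python
from typing import Optional

# Toy version of the test-and-feedback loop from 3.6: keep asking the "model"
# for code until the generated code passes the tests.

def run_tests(source: str) -> Optional[str]:
    """Execute the candidate code and return an error message, or None if it passes."""
    namespace = {}
    try:
        exec(source, namespace)
        assert namespace["add"](2, 3) == 5, "add(2, 3) should be 5"
        return None
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"

def generate_candidate(task: str, feedback: Optional[str]) -> str:
    # stand-in for a code-generating model; a real system would condition on the feedback text
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"   # first, deliberately buggy attempt
    return "def add(a, b):\n    return a + b\n"       # "repaired" attempt after seeing the failure

feedback = None
for attempt in range(1, 6):
    candidate = generate_candidate("write add(a, b)", feedback)
    feedback = run_tests(candidate)
    if feedback is None:
        print(f"passed on attempt {attempt}")
        break
    print(f"attempt {attempt} failed: {feedback}")
```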
3.7 We can't escape the need to use a physical learning environment, or a faithful representation of our physical world. Sure, a neural net can train in a simulated reality, but can you imagine the training times for even simple domains and scenarios? Now consider teaching the NN multiple domains, cross-domain skills, or several domains in parallel. It's possible, but very compute-intensive.
3.8 I think it would be possible to develop an NN like GPT that could satisfy our requirements for sentience, but I suspect AGI will be created by a different kind of development process. The typical NN training approach is quite expensive, and I think more efficient methods will be developed in the coming years.
4. Even the tedious approaches like mapping the human brain are at some point going to allow an emulation of a human mind, which could then be used to do work, inhabit machines, or develop better brains and AI. A fully synthetic human brain is effectively an AGI: it could run at thousands of times the speed of our brains, live forever, and improve on its work indefinitely, and that's before all the modifications and improvements that could be made to turn it into a better intelligence overall. AGI is an inevitability, and I can see many approaches leading to it, even the inefficient ones. We can only argue about when it's going to happen and which technology will get there first.
I would distinguish understanding from sentience. I take a more functionalist approach to understanding, whereas by sentience I mean the presence of "phenomenal feel" -- the "what it is like" stuff (Nagel et al.).
In that sense, I am not entirely sure that phenomenal feel is a matter of degree. Whether one has sentience or not could be a binary matter (although of course the "richness" of the phenomenology would be a matter of degree).
On that view, even bacteria could be sentient for all we know; they may have some very minute "feeling" or sense that "feels like something". This question is more a matter of metaphysics.
I don't think intelligent behavior is necessarily associated with sentience. The two may come together in particular implementations (like humans, potentially, and maybe other biological entities on the evolutionary continuum), but I don't think sentience is necessary for intelligence or for implementing "understanding" functionally.
So I am bracketing off the question of sentience (I am not sure it is even answerable without strong assumptions). Overall, I don't think we should even attempt to make sentient AIs. It's probably ethically cleaner to have non-sentient, value-aligned intelligent AIs.
Regarding AGI, the whole thing is not well defined. One weak definition of AGI would be just "human-like". A less anthropocentric approach would be to try to understand and define the essence of intelligence and then think about what a general intelligence would constitute.
In the first sense, I agree NNs are not quite AGI, and I don't think scaling Transformers is exactly the way (in a trivial sense, even if we achieve a sort of AGI, it wouldn't be quite human-like, because of its sample inefficiency in pre-training and a very different context of learning as opposed to evolution). The latter approach, defining general intelligence less anthropocentrically, gets into controversies, and there isn't really a stable definition or problem to work on.
I am personally skeptical of current Transformers' prospects; I think it's more of a patchwork. But regarding your points:
3.1 I think we all only produce permutations of what we know, at some level. For example, all images and symbols are permutations of pixels and colors that we come across early on. The point of intelligence is to make meaningful permutations, to generalize systematically and compositionally, and so on.
3.2 I don't think that's necessarily true. There are patterns in novelty too. In a simple case, you can potentially train models to create meaningful novel images (or music) from random noise by making them learn the patterns behind the patterns (there are constraints on what we enjoy in art; creativity is a matter of randomizing while maintaining the necessary constraints). Similarly, a model may learn some level of meta-patterns. There's also a reason why different people often come up with the same "novel idea" independently (Newton's and Leibniz's calculus being one example): the contexts (e.g. the scientific literature and such) are often informative in a way that leads keen people to the same "novel" idea. If NNs can learn to model this connection between contexts and new ideas, they can in theory learn to be outliers too. Of course, at this point NNs haven't shown much promise at this, but some specialized training may unlock that ability better someday.
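A very reduced illustration of "randomizing while maintaining the necessary constraints": fit a simple structure to some data, then sample points that were never in the training set but still respect that structure. This is numpy-only toy code, nowhere near real image or music generation:

```python
import numpy as np

rng = np.random.default_rng(42)

# "Training data": noisy points on a circle -- the constraint to be learned.
angles = rng.uniform(0, 2 * np.pi, size=500)
data = np.stack([np.cos(angles), np.sin(angles)], axis=1) + rng.normal(scale=0.05, size=(500, 2))

# "Learn the pattern": estimate the typical radius and its spread from the data.
radii = np.linalg.norm(data, axis=1)
mean_radius, radius_std = radii.mean(), radii.std()

# "Generate novelty": brand-new random angles, constrained to the learned radius.
new_angles = rng.uniform(0, 2 * np.pi, size=5)
new_radii = rng.normal(mean_radius, radius_std, size=5)
samples = np.stack([new_radii * np.cos(new_angles), new_radii * np.sin(new_angles)], axis=1)

print("novel samples (not in the training set, but on the learned circle):")
print(np.round(samples, 3))
```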
3.3 That's just RNNs. You have a hidden state in an internal feedback loop, and a different weight set for the perception of the inputs. Transformers often beat RNNs, though. Still, in more state-based decision problems and some other contexts, Transformers have been used in more RNN-ish fashions.
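A bare-bones numpy sketch of what I mean: the hidden state h is the internal feedback loop, with its own weight matrix (W_hh) separate from the weights that "perceive" the input (W_xh). The sizes and random weights are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim = 3, 5

W_xh = rng.normal(size=(hidden_dim, input_dim))   # weights for perceiving the external input
W_hh = rng.normal(size=(hidden_dim, hidden_dim))  # weights for the internal state feeding back on itself
b = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)  # internal state, carried across timesteps
for t, x in enumerate(rng.normal(size=(4, input_dim))):  # a short input sequence
    h = np.tanh(W_xh @ x + W_hh @ h + b)  # new state mixes the current input with the previous state
    print(f"step {t}: hidden-state norm = {np.linalg.norm(h):.3f}")

# Even with no input at all, the internal loop keeps evolving on its own.
for t in range(3):
    h = np.tanh(W_hh @ h + b)
    print(f"no-input step {t}: hidden-state norm = {np.linalg.norm(h):.3f}")
```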
3.4 Large language models can be in a sense domain-general yet encode a lot of domain-specific information and facts in their weights themselves; I think people play around with this in QA. So the weights act as a memory too, besides the explicit input prompt. Again, it's still more of a patchwork kind of thing, where some memory is encoded in the weights and there isn't a clean operation for updating memories and such; there is probably some work to do there. Note that LaMDA and others can also have a retrieval mechanism -- they retrieve information from the internet/Wikipedia and such -- so in that sense there is also an interface with a huge external memory.
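And the retrieval part is roughly this shape. The corpus, the word-overlap scoring, and the prompt format here are all toy placeholders, not how LaMDA or any real retriever actually works:

```python
# Toy retrieval-augmented setup: fetch relevant text from an external store and
# prepend it to the prompt, so the weights don't have to memorize everything.

documents = {
    "gpt3": "GPT-3 is a large language model released by OpenAI in 2020.",
    "rnn": "Recurrent networks carry a hidden state from one timestep to the next.",
    "paris": "Paris is the capital of France.",
}

def retrieve(query: str, k: int = 1):
    # stand-in for a real retriever: score documents by word overlap with the query
    query_words = set(query.lower().split())
    scored = sorted(documents.values(),
                    key=lambda doc: len(query_words & set(doc.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    # a real system would hand this prompt to the language model;
    # here we only show what the model would be conditioned on
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("When was GPT-3 released"))
```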
3.5 Probably. There are already some semi-successful attempts at doing things cross-domain and cross-modal, though -- for example, GATO:
https://arxiv.org/abs/2205.06175
3.6 By multi-modal I mean incorporating different modalities of data (image and text being the two most popular). There is ongoing progress in program synthesis and such, though, and program verification and feedback are being incorporated -- I think I've read some abstracts like that in my Google Scholar recommendations. Generally, though, I have stopped keeping track of papers that much these days.
3.7 Yes, sample efficiency and RL-related things need more work.
3.8 Probably. Not necessarily new paradigms, though; many old paradigms are also underexplored: continual learning, imitation learning, active inference, IRL, RL, etc. Although ultimately not many people are really focusing on "AGI". Mostly the focus is on particular problems, and we will probably keep seeing more progress on a bunch of specialists and particular problem solvers (e.g. NeRF, Stable Diffusion, assisted theorem proving, protein folding, chatbots (for most business purposes we probably don't need full physics understanding from a chatbot beyond it having a way with words), etc.) -- often beating average humans, or at least sufficiently impressing them.