The original problem was that the machine started spontaneously suspending due to overheating, and I experimentally found that removing two of the four DIMMs (leaving me with 32G rather than 64G of RAM) was a temporary workaround but not a permanent answer. I have an onsite service contract, but Dell will ship parts for a DIY repair if you so request. They publish the full service manual online as a free download, which is a fantastic resource and not "a given" with other manufacturers.
Everything had gone well until I was almost done. Before closing up the system, I powered up for a pretest of the internal connections. The BIOS began running its initial setup, then the power flickered. I'd seen that symptom before and knew the likely cause. That ribbon cable in the last picture of this album connects the palm rest's power button to the system board, and it's notoriously hard to reach or to see if it's fully seated.
As I suspected, it was not fully seated in the connector, and that's where I made my stupid mistake by trying to reseat it without powering down. As I moved the cable, it scraped against something and tore -- and somehow a power rail touched the metal chassis. There was a tiny curl of smoke as I yanked the power cable. Too late. Electricity is a lot faster than human reflexes. Once you let the smoke out of electronic parts, they stop working. I had bricked my very expensive mobile workstation.
Ashamed but honest, I called Dell Support and owned up to what had happened. To my surprise, they didn't charge me for the repair of the repair. My pro service agreement has a clause covering accidental damage, and this fell into that category. They were even willing to continue just shipping me parts. I told the tech that I had done enough damage for one week, and requested an onsite tech.
The field tech got the system powering up, but it had reverted to the original problem with the DIMM sockets. We worked together to diagnose it but concluded that either the new motherboard was faulty or something was going on with the internal power supply or another "deep chassis" subsystem. On his advice, I agreed to ship it off to the depot rather than continuing to iterate on local service. When it returned from the depot, the machine was bootable and stable but had severe thermal throttling issues. I thought it might be a BIOS power setting, but nothing I changed made much difference. Reluctantly, I decided it had to be a bad paste job, which led to this endeavor.
This may come as a surprise, but despite this obviously miserable paste job, I am overall extremely happy with this computer and with Dell tech support. I bought the system about two years ago, and every interaction with Dell support has been superb other than this one failure. My system had a defective heatsink due to a problem with Dell's upstream supplier, and they covered that under warranty with no quibble. When a cooling fan went bad a year later (which can happen with any brand of computer), they shipped me a new heatsink assembly -- a large and expensive part -- without hesitation for a DIY repair (my choice; they would have sent a tech if I had asked). This system is my "daily driver" for work, and it's essential that I have good support on it. Dell has delivered. I'm willing to cut them some slack on one mistake, because we're all human.
The end result is that performance is back where it was before the trouble began, and my CPU and GPU temperatures are down by about 20 °C compared with the depot's paste job. I'm sure it's partly due to Arctic Silver 5 being a very good paste, but I firmly believe the depot applied too much paste and didn't place it well. I'm just glad to have my work machine fully functional again.