Keep in mind that Asimov wrote books clearly spelling out how the Three Laws were insufficient, and how robots could still be used to kill humans while technically obeying them.
To be fair, this was tricky and not a killbot hellscape.
And they ultimately butt up against this question: when robots become advanced enough, how do you take a species that is physically and intellectually superior to us and force them to be our slaves?
It doesn't, but it very well could. Even though the neurons themselves are effectively a black box, the output nodes are mapped to functions that are very well defined. You can, after a node activates but before its action is executed, simply add a manual filter for particularly undesirable nodes, or remove them entirely. It's the equivalent of saying you don't want Mario to jump, so you filter out any jump-button inputs coming from the neural net.
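A minimal sketch of that kind of post-hoc filter, assuming a toy setup: the action names, logits, and the `select_action` helper are all hypothetical, just to show an output being masked after the net runs but before anything executes:

```python
import numpy as np

# Hypothetical action set for a Mario-style agent; names are illustrative.
ACTIONS = ["left", "right", "jump", "run", "noop"]
BLOCKED = {"jump"}  # actions we hard-filter, no matter what the net outputs

def select_action(logits: np.ndarray) -> str:
    """Pick the highest-scoring action after masking blocked outputs.

    `logits` is the raw output vector of the policy network, one score
    per entry in ACTIONS. The mask is applied after the network runs
    but before the action executes, so it acts as a hard rule the
    network can't override.
    """
    masked = logits.copy()
    for i, name in enumerate(ACTIONS):
        if name in BLOCKED:
            masked[i] = -np.inf  # a blocked action can never win the argmax
    return ACTIONS[int(np.argmax(masked))]

# Example: the net's top choice is "jump" (highest logit),
# but the filter forces the runner-up instead.
logits = np.array([0.1, 1.2, 3.5, 0.7, -0.4])
print(select_action(logits))  # -> "right"
```

The point is that the filter lives outside the learned weights entirely, so it behaves like a hard constraint rather than something the network was trained to prefer.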
idk, from what I understand this is the "alignment problem" and it's very nontrivial. Is what you're describing related to the reinforcement-learning techniques (RLHF) they currently use to keep models nice and censored? If so, that approach has been shown to work inconsistently: it functions more like a suggestion than an actual hard rule, and it can backfire.